Abstract

Precoding for multiple-input, multiple-output (MIMO) antenna systems is considered with perfect channel knowledge available at both the transmitter and the receiver. For 2 transmit antennas and QAM constellations, an approximately optimal (with respect to the minimum Euclidean distance between points in the received signal space) real-valued precoder based on the singular value decomposition (SVD) of the channel is proposed, and it is shown to offer a maximum-likelihood (ML)-decoding complexity of $O(\sqrt{M})$ for square $M$-QAM. The proposed precoder is easily obtained for arbitrary QAM constellations, unlike the known complex-valued optimal precoder by Collin et al. for 2 transmit antennas, which exists for 4-QAM alone, has an ML-decoding complexity of $O(M\sqrt{M})$ ($M = 4$), and is extremely hard to obtain for larger QAM constellations. The proposed precoder's loss in error performance for 4-QAM in comparison with the complex-valued optimal precoder is only marginal. Our precoding scheme is extended to a higher number of transmit antennas along the lines of the E-$d_{\min}$ precoder for 4-QAM by Vrigneau et al., which is an extension of the complex-valued optimal precoder for 4-QAM. Compared with the recently proposed X- and Y-precoders, the error performance of our precoder is significantly better. It is shown that our precoder provides full-diversity for QAM constellations, and this is supported by simulation plots of the word error probability for $2 \times 2$, $4 \times 4$ and $8 \times 8$ systems.

Index Terms 

Diversity gain, low ML-decoding complexity, MIMO precoders, singular values, word error probability.

I. Introduction and Background 

Multiple-input, multiple-output (MIMO) antenna systems have evoked a lot of research interest primarily because of the enhanced capacity they provide, compared with that provided by the single antenna point to point channel. Moreover, for a system with $n_t$ transmit antennas and $n_r$ receive antennas ($n_t \times n_r$ system), the maximum diversity gain (refer to Section II for a definition of diversity gain) achievable with coherent detection has been shown to be $n_t n_r$. For MIMO systems with the channel state information available only at the receiver (CSIR), suitably designed space-time block codes (STBCs) [1] provide full-diversity. Full-rate transmission is said to occur if $n_{\min} = \min(n_t, n_r)$ independent information symbols are transmitted in every channel use. Full-rate STBCs achieving full-diversity have also been proposed [2], [3]. However, all full-rate, full-diversity STBCs are characterized by a high ML-decoding complexity (refer to Section II for a formal definition of ML-decoding complexity). In general, decoding full-rate STBCs requires jointly decoding $n_t n_{\min}$ symbols.

MIMO systems with full channel state information at the transmitter (CSIT) or partial CSIT have been extensively studied in the literature. From an information-theoretic perspective, capacity is an important parameter for MIMO systems and waterfilling [4] can be employed to achieve the capacity with a Gaussian codebook. From a signal processing point of view, the error performance of MIMO systems using finite constellations is one of the important parameters, and several precoding¹ schemes have been proposed in this regard. Maximal ratio transmission was introduced in [5] to achieve full-diversity while maximizing the signal-to-noise ratio (SNR) by precoding at the transmitter and equalizing at the receiver for transmission of a single symbol per channel use. Subsequently, the use of precoding and equalizing matrices at the transmitter and the receiver, respectively, was proposed in [6] to maximize the SNR at the receiver, but this scheme resulted in low-rate transmission. Several works on optimal linear precoders and decoders have been done for the minimum mean square error (MMSE) criterion [7]-[10]. Since these precoders are linear and optimal for MMSE decoding, the decoding complexity is very low and full-diversity is also achieved, but the error performance is worse than that for ML-decoding. Other non-ML-decoding techniques include lattice-reduction based techniques [11], which provide full-rate transmission with possibly full-diversity, but lattice-reduction itself involves a high complexity for large MIMO systems. Extensive research has also been done on MIMO systems with limited feedback to the transmitter about the channel from the receiver (see, for example, [12] and references therein). In this paper, we consider MIMO systems with full CSIT. The channel state information could be either sent to the transmitter by the receiver (when there are separate frequency bands for uplink and downlink transmission) or the transmitter could estimate the channel, if it is reciprocal (like in a time division duplexing (TDD) system), by receiving pilot signals from the receiver. In the literature, to the best of our knowledge, there is no known precoding technique that achieves all three attributes: full-rate, full-diversity and low ML-decoding complexity ("low ML-decoding complexity" is a relative term and in this paper, it is used to mean the joint decoding of at most 2 complex symbols).

¹Precoding is also referred to as "transmit beamforming".

Almost all the popular precoding techniques with ML-decoding at the receiver use the singular value decomposition (SVD) of the MIMO channel [13]. The E-$d_{\min}$ precoder for 4-QAM [14], an extension of the complex-valued optimal precoder [15] to a higher number of transmit antennas, has been shown to perform very well for 4-QAM, beating all other linear precoding and decoding schemes based on the MMSE criterion, and ML-decoding involves jointly decoding two complex symbols only. However, this precoder exists in the literature for 4-QAM alone and is very hard to obtain for larger QAM constellations, since it involves a numerical search over 3 parameters. Recently, X- and Y-precoders have been proposed in [16] as rivals for the E-$d_{\min}$ precoder. The X-precoder has been shown to offer an ML-decoding complexity of $O(M)$ (this can be brought down to $O(\sqrt{M})$ by the same decoding scheme as for our precoder, which is explained in Subsection IV-D), while the Y-precoder has an ML-decoding complexity which is invariant with respect to the constellation size $M$. The disadvantage with the X-precoder is that it loses out to the E-$d_{\min}$ precoder in error performance for 4-QAM and it is not known if an explicit expression for the precoding matrix can be obtained for larger QAM constellations. The Y-precoder (which uses a two-dimensional constellation), although explicitly obtainable for constellations of any size $M$, loses out in error performance to the E-$d_{\min}$ precoder, since it has not been optimized for error performance. In the literature, all the aforementioned low ML-decoding complexity precoders have been claimed by their authors to offer a diversity gain of $(n_t - n_{\min}/2 + 1)(n_r - n_{\min}/2 + 1)$ (but the simulation results in this paper indicate that the E-$d_{\min}$ precoder has full-diversity for 4-QAM). Concerned by the limitations of each of the low ML-decoding complexity precoders, we first propose a real-valued, approximately optimal precoder (we explain in Section IV why the precoder is "approximately optimal") based on the SVD of the channel for $n_t = 2$ and then extend it to a higher number of transmit antennas, an approach similar to that in [14]. The ML-decoding complexity offered by our precoder is shown to be $O(\sqrt{M})$ for $M$-QAM. For 4-QAM, the proposed precoder has only a marginally poorer error performance than the E-$d_{\min}$ precoder, but has lower ML-decoding complexity. For larger QAM constellations, it is easily obtainable, unlike the E-$d_{\min}$ precoder. When compared with the X- and Y-precoders, it has a much better error performance. The main contributions of the paper are:

²Throughout this paper, unless otherwise stated, optimality is with respect to the minimum Euclidean distance between points in the received signal space.
1) We propose a novel scheme to obtain an SVD-based, real-valued, approximately optimal precoder for 2 transmit antennas and any $M$-QAM. The method of obtaining this precoder is different from the one taken to obtain the complex-valued optimal precoder for 2 transmit antennas [15], and is easily applicable for any $M$-QAM, unlike that in [15].

2) We extend this real-valued precoder to a higher number of transmit antennas and show that our precoding scheme offers full-diversity with ML-decoding. This is a new result, as the existing low ML-decoding complexity precoders have been claimed to offer a diversity gain of only $(n_t - n_{\min}/2 + 1)(n_r - n_{\min}/2 + 1)$. The simulation plots of the word error probability for $2 \times 2$, $4 \times 4$ and $8 \times 8$ systems support our claims about full-diversity.

3) The ML-decoding complexity of the proposed precoder is shown to be $O(\sqrt{M})$ for square $M$-QAM, in general. However, for a considerable number of channel realizations, no search is required over the $M$ signal points. Specifically for 4-QAM and 2 transmit antennas, simulations reveal that for more than 50% of the channel realizations, no search is needed over any of the signal points. This aspect is elaborated in Subsection IV-D.

The rest of the paper is organized as follows. Section II gives the system model, the relevant definitions and some known results which are needed for our precoder design. A brief review of existing low ML-decoding complexity precoders is given in Section III. The method to obtain the proposed precoder is presented in Section IV and its ML-decoding complexity is analyzed in Subsection IV-D. In Section V, we show how this precoding scheme can be extended to a higher number of transmit antennas, while Section VI deals with the achievable diversity gain with the proposed precoder. Simulation results are given in Section VII and concluding remarks constitute Section VIII.

Notations: Throughout, bold, lowercase letters are used to denote vectors and bold, uppercase letters are used to denote matrices. For a complex matrix $\mathbf{X}$, the Hermitian, the transpose and the Frobenius norm of $\mathbf{X}$ are denoted by $\mathbf{X}^H$, $\mathbf{X}^T$ and $\|\mathbf{X}\|$, respectively. The $i$-th element of a vector $\mathbf{x}$ is denoted by $[\mathbf{x}]_i$, the $(i,j)$-th entry of $\mathbf{X}$ is denoted by $\mathbf{X}(i,j)$, $tr(\mathbf{X})$ denotes the trace of $\mathbf{X}$, and $\mathbf{X} = \mathrm{diag}(x_1, x_2, \cdots, x_n)$ implies that $\mathbf{X}$ is a diagonal matrix with $x_1, x_2, \cdots, x_n$ as the diagonal entries. The sets of all real numbers, complex numbers and integers are denoted by $\mathbb{R}$, $\mathbb{C}$ and $\mathbb{Z}$, respectively. The real and the imaginary parts of a complex-valued vector $\mathbf{x}$ are denoted by $\mathbf{x}_I$ and $\mathbf{x}_Q$, respectively, $|x|$ denotes the absolute value of a complex number $x$ and $|\mathcal{S}|$ denotes the cardinality of the set $\mathcal{S}$. The $T \times T$ identity matrix and the $n \times m$ sized null matrix are denoted by $\mathbf{I}_T$ and $\mathbf{O}_{n \times m}$, respectively. For a complex random variable $X$, $E[X]$ denotes the expectation of $X$, while $X \sim \mathcal{N}_{\mathbb{C}}(0, 1)$ implies that $X$ has the complex normal distribution with zero mean and unit variance. Unless used as a subscript or to denote indices, $j$ represents $\sqrt{-1}$, and for a function $f(x)$, $\arg\min_x f(x)$ and $\arg\max_x f(x)$ denote the value of $x$ which minimizes and maximizes $f(x)$, respectively. For any real number $m$, $\lfloor m \rfloor$ denotes the largest integer not greater than $m$, $\lceil m \rceil$ denotes the smallest integer not smaller than $m$, $\mathrm{rnd}[m]$ denotes the operation that rounds off $m$ to the nearest integer and $\mathrm{sgn}(m)$ gives the sign of $m$, the last two of which can be expressed as

$$\mathrm{rnd}[m] = \begin{cases} \lfloor m \rfloor, & \text{if } \lceil m \rceil - m > m - \lfloor m \rfloor \\ \lceil m \rceil, & \text{otherwise} \end{cases}, \qquad \mathrm{sgn}(m) = \begin{cases} 1, & \text{if } m > 0 \\ -1, & \text{otherwise.} \end{cases}$$
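As a minimal sketch of the two piecewise definitions above, the following Python helpers mirror them branch by branch. The function names `rnd` and `sgn` follow the notation of the text; mapping zero to the "otherwise" branch of `sgn` is an assumption of this sketch.

```python
import math


def rnd(m: float) -> int:
    """Round m to the nearest integer; exact ties go to the ceiling."""
    lower, upper = math.floor(m), math.ceil(m)
    # Pick the floor only when m is strictly closer to it than to the ceiling.
    return lower if (upper - m) > (m - lower) else upper


def sgn(m: float) -> int:
    """Sign of m; zero falls into the 'otherwise' branch (assumption)."""
    return 1 if m > 0 else -1
```

For example, `rnd(2.5)` takes the "otherwise" branch and yields the ceiling, since the two distances are equal.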

The Gamma function and the Q-function of $x$ are denoted by $\Gamma(x)$ and $Q(x)$, respectively, and given as

$$\Gamma(x) = \int_0^{\infty} e^{-t} t^{x-1}\, dt, \qquad Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt.$$
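The two special functions can be evaluated with the Python standard library. The sketch below is only illustrative: it uses the standard identity $Q(x) = \frac{1}{2}\mathrm{erfc}(x/\sqrt{2})$ and compares it against a plain trapezoidal approximation of the defining integral. The helper names `q_function` and `q_function_numeric`, the truncation point and the step count are assumptions of this sketch.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def q_function_numeric(x: float, upper: float = 12.0, steps: int = 20000) -> float:
    """Trapezoidal approximation of the defining integral, truncated at `upper`."""
    h = (upper - x) / steps

    def density(t: float) -> float:
        return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

    total = 0.5 * (density(x) + density(upper))
    total += sum(density(x + i * h) for i in range(1, steps))
    return total * h
```

`math.gamma` provides $\Gamma(x)$ directly, so properties used later in the paper, such as $\Gamma(t+1) = t\,\Gamma(t)$ and $\Gamma(1/2) = \sqrt{\pi}$, can be checked numerically.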

Let $f(x)$ and $g(x)$ be two functions. Then, $f(x) = O(g(x))$ if and only if there exists a positive constant $c < \infty$ such that

$$\lim_{x \to \infty} \left| \frac{f(x)}{g(x)} \right| \leq c,$$

and $f(x) = o(g(x))$ as $x \to a$ if and only if

$$\lim_{x \to a} \frac{f(x)}{g(x)} = 0.$$

For a real variable $t$, the unit step function $u(t)$ is defined as $u(t) = 1$, if $t \geq 0$, and $u(t) = 0$, if $t < 0$.

II. System Model 

We consider an $n_t \times n_r$ MIMO system with full CSIT and CSIR. The channel is assumed to be quasi-static and flat with Rayleigh fading. The channel is modelled as

$$\mathbf{y} = \sqrt{\frac{SNR}{n_t}}\, \mathbf{H}\mathbf{s} + \mathbf{n}, \tag{1}$$

where $\mathbf{y} \in \mathbb{C}^{n_r \times 1}$ is the received vector, $\mathbf{H} \in \mathbb{C}^{n_r \times n_t}$ is the channel matrix, $\mathbf{s} \in \mathbb{C}^{n_t \times 1}$ is the precoded symbol vector and $\mathbf{n} \in \mathbb{C}^{n_r \times 1}$ is the noise vector. The entries of $\mathbf{H}$ and $\mathbf{n}$ are i.i.d. circularly symmetric complex Gaussian random variables with zero mean and variance 0.5 per real dimension. In (1), the scalar $SNR$ is the average SNR at each receive antenna, and $\mathbf{s}$ is constrained such that $E[tr(\mathbf{s}\mathbf{s}^H)] = n_t$. The precoded symbol vector $\mathbf{s}$ can be defined as

$$\mathbf{s} \triangleq \frac{1}{\sqrt{E}}\, \mathbf{M}\mathbf{x},$$

where $\mathbf{M} \in \mathbb{C}^{n_t \times n_{\min}}$ is the precoding matrix, with $\|\mathbf{M}\|^2 = n_t$, and $\mathbf{x} = [x_1, x_2, \cdots, x_{n_{\min}}]^T$ is the symbol vector, with its entries taking values independently from a signal constellation denoted by $\mathcal{A}$, having an average energy of $E$ units. The rate of transmission is $n_{\min}$ independent symbols per channel use. Note that in this model, the variable scalar which defines the average SNR at each receive antenna is $SNR$, while $E$ is a constant. For example, for a standard $M$-QAM, with $M = 2^{2a}$ for some positive integer $a$, $E = 2(M - 1)/3$.
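The value $E = 2(M-1)/3$ can be checked numerically. The sketch below assumes the usual square QAM grid with odd-integer coordinates $\pm 1, \pm 3, \cdots$ in each dimension; the helper names `pam`, `square_qam` and `average_energy` are illustrative and not taken from the paper.

```python
import math


def pam(size: int) -> list:
    """PAM levels 2i - size - 1 for i = 1..size, i.e. -(size-1), ..., size-1."""
    return [2 * i - size - 1 for i in range(1, size + 1)]


def square_qam(M: int) -> list:
    """Square M-QAM built from a sqrt(M)-PAM on each real dimension."""
    root = math.isqrt(M)
    if root * root != M:
        raise ValueError("M must be a perfect square")
    levels = pam(root)
    return [complex(a, b) for a in levels for b in levels]


def average_energy(points: list) -> float:
    """Mean squared magnitude of the constellation points."""
    return sum(abs(p) ** 2 for p in points) / len(points)
```

For 4-QAM the points are $\pm 1 \pm j$, each with energy 2, which agrees with $2(4-1)/3 = 2$.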

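Let $\mathbf{H} = \mathbf{U}\mathbf{D}\mathbf{V}^H$, obtained on the SVD of $\mathbf{H}$, with $\mathbf{U} \in \mathbb{C}^{n_r \times n_r}$ and $\mathbf{V} \in \mathbb{C}^{n_t \times n_t}$ being unitary matrices. $\mathbf{D} \in \mathbb{R}^{n_r \times n_t}$ is such that $\mathbf{D} = [\mathbf{D}_1 \ \mathbf{O}_{n_r \times (n_t - n_r)}]$ if $n_t \geq n_r$ and $\mathbf{D} = [\mathbf{D}_1 \ \mathbf{O}_{n_t \times (n_r - n_t)}]^T$ if $n_t < n_r$, where $\mathbf{D}_1 \in \mathbb{R}^{n_{\min} \times n_{\min}}$ is a diagonal matrix given by $\mathbf{D}_1 = \mathrm{diag}(\sigma_1, \sigma_2, \cdots, \sigma_{n_{\min}})$, with $\sigma_1, \sigma_2, \cdots, \sigma_{n_{\min}}$ being the non-zero singular values of $\mathbf{H}$, placed in the descending order on the diagonal. Let the precoding matrix $\mathbf{M}$ be given as

$$\mathbf{M} = \mathbf{V}\mathbf{P}, \tag{2}$$

where $\mathbf{P} \in \mathbb{C}^{n_t \times n_{\min}}$. Now, (1) can be written as

$$\mathbf{y}' = \sqrt{\frac{SNR}{n_t E}}\, \mathbf{D}\mathbf{P}\mathbf{x} + \mathbf{n}', \tag{3}$$

where $\mathbf{y}' = \mathbf{U}^H\mathbf{y}$ and $\mathbf{n}' = \mathbf{U}^H\mathbf{n}$, with the distribution of $\mathbf{n}'$ being the same as that of $\mathbf{n}$.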

The ML-decoding rule seeks to find that $\mathbf{x} \in \mathcal{A}^{n_{\min} \times 1}$ which minimizes the metric given by

$$m(\mathbf{x}) = \left\| \mathbf{y}' - \sqrt{\frac{SNR}{n_t E}}\, \mathbf{D}\mathbf{P}\mathbf{x} \right\|^2. \tag{4}$$

Clearly, the error performance of the system depends on the choice of $\mathbf{P}$ and $\mathcal{A}$. From (2), it is evident that the design of the precoding matrix $\mathbf{M}$ amounts to designing $\mathbf{P}$. Henceforth in this paper, $\mathbf{P}$ is referred to as the precoder and the constellation is assumed to be an $M$-QAM, where $M = 2^{2a}$ for some positive integer $a$.
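To make the ML rule concrete, the following sketch performs an exhaustive search over all symbol vectors. It assumes the metric has the form $\|\mathbf{y}' - \sqrt{SNR/(n_t E)}\,\mathbf{D}\mathbf{P}\mathbf{x}\|^2$; the function name `ml_decode`, the nested-list matrix representation and the returned candidate count are illustrative choices of this sketch, not the paper's decoder.

```python
import itertools
import math


def ml_decode(y_prime, D, P, constellation, snr, n_t, energy):
    """Brute-force ML search; returns (best vector, best metric, candidates tried)."""
    scale = math.sqrt(snr / (n_t * energy))
    n_min = len(P[0])
    rows = len(D)
    # Effective matrix G = D P, computed with plain nested lists.
    G = [[sum(D[r][k] * P[k][c] for k in range(len(P))) for c in range(n_min)]
         for r in range(rows)]
    best, best_metric, tried = None, math.inf, 0
    for x in itertools.product(constellation, repeat=n_min):
        tried += 1
        metric = sum(
            abs(y_prime[r] - scale * sum(G[r][c] * x[c] for c in range(n_min))) ** 2
            for r in range(rows)
        )
        if metric < best_metric:
            best, best_metric = x, metric
    return best, best_metric, tried
```

The loop visits $M^{n_{\min}}$ candidates, which is the worst-case cost that low-complexity precoders aim to avoid by reducing the number of symbols that must be decoded jointly.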

Definition 1: (Full-diversity precoder) In a MIMO system, if at a high SNR, the average probability $P_e$ that a transmitted symbol vector is wrongly decoded is given by

$$P_e \approx (G_c \cdot SNR)^{-G_d},$$

where $\approx$ stands for "is approximately equal to", then, $G_d$ and $G_c$ are called the diversity gain (or diversity order) and the coding gain of the system, respectively. For a MIMO system with precoding, if $G_d = n_t n_r$, then, we call the precoder a full-diversity precoder.
Definition 2: (ML-decoding complexity) The ML-decoding complexity is measured in terms of the number of computations involved in minimizing the ML-decoding metric given in (4) and is a function of the constellation size $M$. If at most $k$ symbols are required to be jointly decoded, the ML-decoding complexity is said to be $O(M^k)$.

Note that the above definition of the ML-decoding complexity is with respect to the worst-case ML-decoding complexity. The use of a sphere decoder [17] can effectively result in a much lower average ML-decoding complexity that depends on the dimension of the sphere decoder and not on the constellation size [18]. For a complex lattice constellation of size $M$, if the ML-decoding complexity is $O(M^k)$, the dimension of the real-valued sphere decoder to be used would be $2k$. As a result, a precoding scheme with higher worst-case ML-decoding complexity than another precoding scheme will also have higher average ML-decoding complexity. Hence, throughout this paper, we consider only the worst-case ML-decoding complexity.

We make use of the following known results, which are needed for our purpose. 



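Theorem 1: [19] For a scalar channel modelled by $y = \sqrt{SNR}\,\beta x + n$, where $n \sim \mathcal{N}_{\mathbb{C}}(0, 1)$, $E[|x|^2] = 1$ and $\alpha = |\beta|^2$ is a nonnegative random variable whose probability density function (PDF) $f_\alpha(\alpha)$ is such that

$$f_\alpha(\alpha) = c\,\alpha^t + o(\alpha^t), \quad \text{as } \alpha \to 0^+,$$

the average symbol error probability (SEP) $P_e$, which is given by

$$P_e = E[P_{e,\alpha}] = \int_0^{\infty} Q\left(\sqrt{k\alpha\, SNR}\right) f_\alpha(\alpha)\, d\alpha,$$

is such that as $SNR \to \infty$,

$$P_e = \frac{2^t c\, \Gamma\left(t + \frac{3}{2}\right)}{\sqrt{\pi}\,(t+1)}\, (k \cdot SNR)^{-(t+1)} + o\left(SNR^{-(t+1)}\right),$$

where $k$ is a fixed positive constant depending on the constellation, $c$ is another constant defining the marginal PDF of $\alpha$ and $P_{e,\alpha} = Q\left(\sqrt{k\alpha\, SNR}\right)$ is the $\alpha$ dependent instantaneous SEP. If $E[\alpha] = 1$, then, $SNR$ is the average SNR at the receiver and the diversity gain $G_d$ and the coding gain $G_c$ can be defined as

$$G_d = t + 1, \qquad G_c = k \left( \frac{2^t c\, \Gamma\left(t + \frac{3}{2}\right)}{\sqrt{\pi}\,(t+1)} \right)^{-\frac{1}{t+1}}.$$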

Given that $\sigma_i$, $i = 1, 2, \cdots, n_{\min}$, are the non-zero singular values of $\mathbf{H}$, it is known that $\sigma_i^2$ are the non-zero eigenvalues of $\mathbf{H}\mathbf{H}^H$, which are denoted in the descending order by $\lambda_i$, $i = 1, 2, \cdots, n_{\min}$. The following theorem gives the expression for the first order expansion of the marginal PDF of $\lambda_i$ as $\lambda_i \to 0^+$.

Theorem 2: [20] Let the entries of the $n_r \times n_t$ matrix $\mathbf{H}$ be i.i.d. complex Gaussian with zero mean and unit variance. The first order expansion of the marginal PDF of the $k$-th largest eigenvalue $\lambda_k$ of the complex central Wishart matrix $\mathbf{H}\mathbf{H}^H$ is given by $f_{\lambda_k}(\lambda_k) = a_k \lambda_k^{d_k} + o\left(\lambda_k^{d_k}\right)$, as $\lambda_k \to 0^+$, $k = 1, 2, \cdots, n_{\min}$, with $d_k = (n_t - k + 1)(n_r - k + 1) - 1$ and $a_k$ being positive constants.

In (3), if $\mathbf{P} = \mathbf{I}_{n_t}$ or $\mathbf{P} = [\mathbf{I}_{n_r} \ \mathbf{O}_{n_r \times (n_t - n_r)}]^T$, depending on whether $n_{\min} = n_t$ or $n_{\min} = n_r$, respectively, each of the symbols $x_i$, $i = 1, 2, \cdots, n_{\min}$, will experience a diversity gain given by $G_{d_i} = (n_t - i + 1)(n_r - i + 1)$. This is evident from Theorem 1 and Theorem 2. The above operation of premultiplying the symbol vector by $\mathbf{V}\mathbf{P}$, with $\mathbf{P} = \mathbf{I}_{n_t}$ (for $n_t \leq n_r$) or $\mathbf{P} = [\mathbf{I}_{n_r} \ \mathbf{O}_{n_r \times (n_t - n_r)}]^T$ (for $n_t > n_r$), can be viewed as resulting in $n_{\min}$ virtual subchannels. So, the overall diversity gain for the symbol vector is $\min\{G_{d_i},\ i = 1, 2, \cdots, n_{\min}\} = (n_{\max} - n_{\min} + 1)$, where $n_{\max} = \max(n_t, n_r)$. This is the least diversity order one can obtain in a precoded MIMO system with ML-decoding. However, assuming that the symbols take values from an arbitrary signal constellation of size $M$, the ML-decoding complexity is $O(M)$, since each symbol can be decoded independently from the others.
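The per-subchannel diversity orders and their minimum are easy to tabulate. The short sketch below evaluates $G_{d_i} = (n_t - i + 1)(n_r - i + 1)$ for each virtual subchannel; the function name `subchannel_diversity` is an illustrative choice.

```python
def subchannel_diversity(n_t: int, n_r: int) -> list:
    """Diversity gain of each virtual subchannel i = 1..n_min without mixing."""
    n_min = min(n_t, n_r)
    return [(n_t - i + 1) * (n_r - i + 1) for i in range(1, n_min + 1)]
```

For a $2 \times 2$ system this gives gains 4 and 1, so the weakest subchannel limits the symbol vector to diversity order 1, far below the full diversity $n_t n_r = 4$.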

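Let $\Delta\mathbf{x} = \mathbf{x} - \mathbf{x}'$, where $\mathbf{x}, \mathbf{x}' \in \mathcal{A}^{n_{\min} \times 1}$.

Theorem 3: [21] For $\mathbf{P}$ such that $[\mathbf{P}\Delta\mathbf{x}]_1 \neq 0$ for any non-zero value of $\Delta\mathbf{x} \in \{\mathbf{x} - \mathbf{x}' \mid \mathbf{x}, \mathbf{x}' \in \mathcal{A}^{n_{\min} \times 1}\}$, the diversity gain of the system is $n_t n_r$.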

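Proof: The instantaneous probability that a transmitted symbol vector $\mathbf{x}$ is falsely decoded to some other vector $\mathbf{x}'$ is given by

$$Pr\{\mathbf{x} \to \mathbf{x}'\} = Q\left( \sqrt{\frac{SNR}{2 n_t E}}\, \|\mathbf{D}\mathbf{P}(\mathbf{x} - \mathbf{x}')\| \right). \tag{5}$$

Let $e_{\min} = \min_{\Delta\mathbf{x}}\{|[\mathbf{P}\Delta\mathbf{x}]_1|\}$, with $\Delta\mathbf{x} \neq \mathbf{O}_{n_{\min} \times 1}$. So, the probability $P_e(\mathbf{x})$ that a transmitted vector $\mathbf{x}$ is falsely decoded is upper bounded as

$$P_e(\mathbf{x}) \leq \left(|\mathcal{A}|^{n_{\min}} - 1\right) Q\left( \sqrt{\frac{SNR}{2 n_t E}}\, \mathbf{D}(1,1)\, e_{\min} \right), \tag{6}$$

where $\mathbf{D}(1, 1) = \sigma_1$, the largest singular value of $\mathbf{H}$. Assuming that all the symbol vectors taking values from $\mathcal{A}^{n_{\min} \times 1}$ are equally likely to be transmitted, the average instantaneous word error probability (WEP), dependent on $\mathbf{D}$, is given by

$$P_{e,\mathbf{D}} = \frac{1}{|\mathcal{A}|^{n_{\min}}} \sum_{\mathbf{x} \in \mathcal{A}^{n_{\min} \times 1}} P_e(\mathbf{x}). \tag{7}$$

Using (6) in (7),

$$P_{e,\mathbf{D}} \leq \left(|\mathcal{A}|^{n_{\min}} - 1\right) Q\left( \sqrt{\frac{SNR}{2 n_t E}}\, \sigma_1 e_{\min} \right) = \left(|\mathcal{A}|^{n_{\min}} - 1\right) Q\left( \sqrt{\frac{e_{\min}^2}{2 n_t E}\, \lambda_1\, SNR} \right),$$

where $\lambda_1 = \sigma_1^2$. So, from Theorem 1 and Theorem 2, the average WEP $P_e$ as $SNR \to \infty$ is given by

$$P_e \leq C \cdot SNR^{-n_t n_r} + o\left(SNR^{-n_t n_r}\right), \tag{8}$$

where

$$C = \left(|\mathcal{A}|^{n_{\min}} - 1\right) \frac{a_1 (2 n_t n_r - 1)(2 n_t n_r - 3) \cdots 1}{2 n_t n_r} \left( \frac{e_{\min}^2}{2 n_t E} \right)^{-n_t n_r},$$

with $a_1$ being a positive constant such that $f_{\lambda_1}(\lambda_1) = a_1 \lambda_1^{n_t n_r - 1} + o\left(\lambda_1^{n_t n_r - 1}\right)$ as $\lambda_1 \to 0^+$. Note that in obtaining $C$, we have used the fact that $\Gamma(t + 1) = t\,\Gamma(t)$ and $\Gamma(1/2) = \sqrt{\pi}$. Since $e_{\min} > 0$, $C < \infty$ and from (8), the diversity gain achieved by the system is $n_t n_r$. ■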
An alternative proof of Theorem 3 has been presented in [21]. Since the steps of our proof are used in Section VI of this paper, and also for the sake of completeness, we have provided our version of the proof.

Note: The condition that $[\mathbf{P}\Delta\mathbf{x}]_1 \neq 0$ for any non-zero value of $\Delta\mathbf{x} \in \{\mathbf{x} - \mathbf{x}' \mid \mathbf{x}, \mathbf{x}' \in \mathcal{A}^{n_{\min} \times 1}\}$ is only sufficient to guarantee full-diversity. There might be several precoders which do not satisfy this condition but still give full-diversity. This will be elaborated in Section VI. Also note that in Theorem 3, the constraint is only on the first entry of $\mathbf{P}\Delta\mathbf{x}$. The other entries are allowed to be zeros.

Obtaining $\mathbf{P}$ such that $e_{\min} \neq 0$ is not difficult. Choosing $\mathbf{P}$ to be $[\mathbf{G} \ \mathbf{O}_{n_r \times (n_t - n_r)}]^T$ (for $n_t > n_r$) or $\mathbf{G}$ (for $n_t \leq n_r$) for QAM constellations, where $\mathbf{G} \in \mathbb{R}^{n_{\min} \times n_{\min}}$ is the rotated $\mathbb{Z}^{n_{\min}}$ lattice generator matrix with a non-zero product distance, as presented in [22], ensures that the diversity gain is $n_t n_r$. If $\mathcal{A}$ is a square QAM constellation of size $M$, the ML-decoding complexity is $O\left(M^{n_{\min}/2}\right)$, since all the $n_{\min}$ independent symbols are entangled in the decoding metric, but the real part of the symbol vector can be decoded independently of the imaginary part. This is possible because $\mathbf{G}$ is real-valued. In [21], complex-valued precoders are used to achieve full-diversity and they offer an ML-decoding complexity of $O\left(M^{n_{\min}}\right)$.
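The complexity reduction for a real-valued mixing matrix can be illustrated with a small experiment. In the sketch below, `G` stands for any real effective matrix (in the setting above it would play the role of the real product $\mathbf{D}\mathbf{P}$, scaled); the numeric values of `G`, the noise and the helper names are hypothetical and chosen only for illustration. The split decoder searches the PAM alphabet separately for the real and imaginary parts, while the joint decoder searches the full QAM alphabet.

```python
import itertools


def pam(size: int) -> list:
    """PAM levels -(size-1), ..., size-1 in steps of 2."""
    return [2 * i - size - 1 for i in range(1, size + 1)]


def _nearest_real(y_vals, G, levels):
    """Minimise ||y - G a||^2 over real vectors a with entries from `levels`."""
    n = len(G[0])

    def cost(a):
        return sum((y_vals[r] - sum(G[r][c] * a[c] for c in range(n))) ** 2
                   for r in range(len(G)))

    return min(itertools.product(levels, repeat=n), key=cost)


def decode_split(y, G, levels):
    """Decode real and imaginary parts separately (valid because G is real)."""
    re = _nearest_real([v.real for v in y], G, levels)
    im = _nearest_real([v.imag for v in y], G, levels)
    return tuple(complex(a, b) for a, b in zip(re, im))


def decode_joint(y, G, levels):
    """Reference decoder: exhaustive search over all complex symbol vectors."""
    n = len(G[0])
    qam = [complex(a, b) for a in levels for b in levels]

    def cost(x):
        return sum(abs(y[r] - sum(G[r][c] * x[c] for c in range(n))) ** 2
                   for r in range(len(G)))

    return min(itertools.product(qam, repeat=n), key=cost)
```

With two 16-QAM symbols, the joint search visits $16^2 = 256$ candidates, whereas the split search visits $2 \times 4^2 = 32$, in line with the $O(M^{n_{\min}/2})$ count.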

III. Review of Low ML-decoding complexity Precoders

This section gives a brief overview of existing low-complexity precoders. The first precoder is called the E-$d_{\min}$ precoder [14], which is an extension of the MIMO precoder for $n_t = 2$ [15], developed for 4-QAM.

A. E-$d_{\min}$ precoder

The precoder $\mathbf{P}$ of size $n_{\min} \times n_{\min}$ (for $n_t > n_r$, the remaining $n_t - n_r$ rows of $\mathbf{P}$ are zeros) has the following structure



Mi(l,l) 



Mi (1,2) 



M 2 (l,l) 



M 2 (l,2) 



Ml*(l,l) Mn^M) 

M ~.i, (2,1) M - in (2,2) 



M 2 (2,l) 



M 2 (2,2) 



Mi (2,1) 

where, if ji = tan -1 ( cr "'"'"~' +1 ) is such that < ji < j Q , then, 



Mi (2, 2) 



n. 



. / 3-V3 -,-7r/12 



and if 7 G < 7i < 7r/4, 



M, 



n t T? 



n. 



COS V'i 

sin ipi 



1 e i7r / 4 
-1 e^/ 4 



where, 



/^2-l\ / /3V3-2./6 + 2^/2-3 

V^i = tan , 70 = tan 



\ cos7i 



3^/3-2^/6 + 1 



0.3016 



(9) 



(10) 



and n = J<h** (pf 5( 7 *) -^J , with Pi = ^ a 2 + o^.^ 



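and

$$\delta(\gamma_i) = \begin{cases} \left(1 - \frac{1}{\sqrt{3}}\right) \cos^2\gamma_i, & \text{if } 0 \leq \gamma_i \leq \gamma_0 \\[4pt] \dfrac{(4 - 2\sqrt{2}) \cos^2\gamma_i \sin^2\gamma_i}{1 + (2 - 2\sqrt{2}) \cos^2\gamma_i}, & \text{otherwise.} \end{cases}$$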



The precoder essentially entangles the virtual subchannels with index $i$ and $n_{\min} - i + 1$, $i = 1, 2, \cdots, n_{\min}/2$.³ Such a scheme will have an ML-decoding complexity of $O(M\sqrt{M})$. It has been shown that the scheme guarantees a diversity gain equal to $(n_t - n_{\min}/2 + 1)(n_r - n_{\min}/2 + 1)$. Also, the precoder is optimal among precoders based on the SVD of the channel for $n_{\min} = 2$ and 4-QAM [15].

³For odd valued $n_{\min}$, $n_{\min}/2$ is replaced by $\lfloor n_{\min}/2 \rfloor$ and the $(\lfloor n_{\min}/2 \rfloor + 1)$-th subchannel is left unpaired.
B. X-precoder

The X-precoder has the same structure as in (9), with the matrices $\mathbf{M}_i$ given as



M, 



COS I 

sin i 



- sm( 
cos ( 



where, for 4-QAM, 



tt/4, 



tan 



l I 1— tan 2 7; — -y/l+tan 4 ■yi— 3 tan 2 74 
tan 2 7; 



if 74 > 7r/3 
otherwise. 



This scheme has also been shown to guarantee a diversity gain equal to $(n_t - n_{\min}/2 + 1)(n_r - n_{\min}/2 + 1)$, but has an ML-decoding complexity of $O(\sqrt{M})$ only (refer to Subsection IV-D for details). However, it is expected to lose out in performance for 4-QAM when compared with the E-$d_{\min}$ precoder, since it is not optimal. Also, an explicit expression for the precoder when $M > 4$ does not exist.



C. Y-precoder 

The y-precoder [16] has the ^-structure but it uses a displacement vector and its precoded symbol 



vector s can be written as 



s = V(Px + u) 



(11) 



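where $\mathbf{u}$ is the displacement vector. The precoded vector can also be expressed as

$$\mathbf{s} = \mathbf{V}\mathbf{P}_{eff}\mathbf{x}_{eff},$$

where $\mathbf{P}\mathbf{x} + \mathbf{u} = \mathbf{P}_{eff}\mathbf{x}_{eff}$, with $\mathbf{P}_{eff}$ and $\mathbf{x}_{eff}$ being the effective precoder and the effective symbol vector, respectively. These are defined as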



where, 



(a>i,bi) 



3n t 







n r (M 2 — l) 

n t a. I n t 

3n r (/3 2 +M') ' n r (/3 2 +M') 



if A 2 > 



M 2 -l 



otherwise, 
and M'



M 2 -l 



and fa 



. The constellation A G Z 2xl of size M is two-dimensional with 



the signal vectors (not to be confused with the symbol vector ~x. e ff) z;, / = 1, 2, • • • , M defined as 



z/ 



2Z - M - 1 
(-1)' 



and the symbol vector that is associated with the (i, n r 



i + subchannel pairing is Sj 



iV„ in -i+i> with Vj,V nmin _ i+ i G A i = 1,2, 

X e// = Ml' Ml, • ■ ■ , 



,n r 



i/2. Hence, 



]i,[s: 



M2, [Sl]2 



So, the effective precoder of the Y-precoder is a diagonal matrix, while $\mathbf{P}$, as given in (11), has the 'Y' structure. The Y-precoder has been shown to have better error performance than the X-precoder for "ill-conditioned" channels, i.e., for low values of $\sigma_{n_{\min} - i + 1}/\sigma_i$, $i = 1, 2, \cdots, n_{\min}/2$, while for well-conditioned channels, the X-precoder has better error performance. However, the Y-precoder has lower ML-decoding complexity, which is $O(1)$. Hence, among all existing precoders, the Y-precoder has the least ML-decoding complexity while the E-$d_{\min}$ precoder has the best performance for 4-QAM.

IV. SVD-based, Approximately Optimal, Real-valued Precoder for $n_t = 2$

In this section, we propose a real-valued precoder for 2 transmit antennas and QAM constellations. The precoder is approximately optimal among the SVD based real-valued precoders for QAM constellations. The primary advantage of this precoder over the complex-valued optimal precoder [15] is that it is much easier to find the entries of the precoder for larger constellations, since it has only 2 parameters that need to be searched for, while the complex-valued precoder has 3 parameters. Without loss of generality, we consider 2 receive antennas and 2 transmit antennas, for which $\mathbf{D}$ in (3) can be expressed as

$$\mathbf{D} = \rho \begin{bmatrix} \cos\gamma & 0 \\ 0 & \sin\gamma \end{bmatrix},$$

where $\rho = \sqrt{\sigma_1^2 + \sigma_2^2}$ and $\gamma = \tan^{-1}(\sigma_2/\sigma_1)$. Clearly, $0 \leq \gamma \leq \pi/4$. Let

$$E_{\min}(\mathbf{P}) = \min\left\{ \|\mathbf{D}\mathbf{P}\Delta\mathbf{x}\|^2,\ \Delta\mathbf{x} \in \{\mathbf{x} - \mathbf{x}' \mid \mathbf{x}, \mathbf{x}' \in \mathcal{A}^{2 \times 1}\},\ \Delta\mathbf{x} \neq \mathbf{O}_{2 \times 1} \right\}. \tag{12}$$

From (5), the optimal precoder is given by $\mathbf{P}_{opt} = \arg\max_{\mathbf{P}}\{E_{\min}(\mathbf{P})\}$, which may or may not be unique.

In [15], $\mathbf{P}_{opt} \in \mathbb{C}^{2 \times 2}$ was obtained for 4-QAM as follows. Using SVD, $\mathbf{P} \in \mathbb{C}^{2 \times 2}$ can be written as $\mathbf{P} = \mathbf{A}\mathbf{\Sigma}\mathbf{B}^H$, where $\mathbf{A}$ is a unitary matrix of size $2 \times 2$ and

1 


For QAM constellations, because of the symmetry associated with the constellation, $0 \leq \theta \leq \pi/4$, $0 \leq \psi \leq \pi/2$ and $0 \leq \phi \leq \pi/2$. It was shown in [15] that $\mathbf{A}$ can be taken to be identity without affecting the optimality. Using numerical search, the optimal values for $\theta$, $\psi$ and $\phi$ were found for 4-QAM. However, there are two major obstacles when this method is used for larger QAM constellations. Firstly, numerical search becomes practically hard for larger constellations because there are three parameters to be searched for. Secondly, numerical searches do not give a closed form expression for the optimal angles, and the method employed in [15] to obtain closed form expressions for the optimal angles for 4-QAM is not amenable to larger QAM constellations. Due to these limitations, we look for a real-valued optimal precoder which also naturally offers lower ML-decoding complexity (this is elaborated in Subsection IV-D). A real-valued precoder can be expressed as $\mathbf{P}(\psi, \theta) = \mathbf{A}\mathbf{\Sigma}\mathbf{B}^T$, where $\mathbf{A}$ can be taken to be identity without affecting optimality and

$$\mathbf{\Sigma} = \sqrt{2} \begin{bmatrix} \cos\psi & 0 \\ 0 & \sin\psi \end{bmatrix}, \qquad \mathbf{B} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}.$$

Note that there are only two parameters to be searched for. Our approach towards finding the optimal precoders is also based on numerical search, but the method to obtain closed form expressions for the optimal angles is novel and easily applicable for any $M$-QAM. However, since this method is based on numerical search, it is not known if the angles are exactly optimal. Finding the exactly optimal values of $\theta$ and $\psi$ as a function of $\gamma$ involves an exhaustive search over the range of $\theta$ and $\psi$, which is practically impossible. However, a numerical search, with $\theta$ and $\psi$ varying in very small increments, gives the values of $\theta$ and $\psi$, which we denote by $\theta^*$ and $\psi^*$, respectively, such that $E_{\min}(\mathbf{P}(\psi^*, \theta^*))$ is nearly equal to $E_{\min}(\mathbf{P}_{opt})$, with $\mathbf{P}_{opt}$ being the optimal real-valued precoder. For this reason, we call our precoder approximately optimal.

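A square QAM signal set (not necessarily Gray coded) of size $M$ is given by

$$\mathcal{A}_{M\text{-QAM}} = \{a + jb \mid a, b \in \mathcal{A}_{\sqrt{M}\text{-PAM}}\}, \tag{14}$$

where $\mathcal{A}_{\sqrt{M}\text{-PAM}} = \{2i - \sqrt{M} - 1,\ i = 1, 2, \cdots, \sqrt{M}\}$ is a PAM constellation of size $\sqrt{M}$. Let

$$\mathbf{F}(\gamma, \psi, \theta) = \begin{bmatrix} \cos\gamma & 0 \\ 0 & \sin\gamma \end{bmatrix} \begin{bmatrix} \cos\psi & 0 \\ 0 & \sin\psi \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

and

$$\delta(\gamma, \mathcal{A}) = \max_{(\psi, \theta)} \left\{ \min_{\Delta\mathbf{x} \mid \Delta\mathbf{x} \neq \mathbf{O}_{2 \times 1}} \left\{ \|\mathbf{F}(\gamma, \psi, \theta)\Delta\mathbf{x}\|^2,\ \Delta\mathbf{x} \in \{\mathbf{x} - \mathbf{x}' \mid \mathbf{x}, \mathbf{x}' \in \mathcal{A}^{2 \times 1}\} \right\} \right\}, \tag{15}$$

where, for our numerical search, we take $\psi = \Delta \cdot k$, $k = 1, 2, \cdots, \lfloor \pi/(2\Delta) \rfloor$, $\theta = \Delta \cdot k$, $k = 1, 2, \cdots, \lfloor \pi/(4\Delta) \rfloor$, with $\Delta$ being the increment size, taken to be 0.001 radians for our searches. Let

$$(\psi^*, \theta^*) = \arg\max_{(\psi, \theta)} \left\{ \min_{\Delta\mathbf{x} \mid \Delta\mathbf{x} \neq \mathbf{O}_{2 \times 1}} \left\{ \|\mathbf{F}(\gamma, \psi, \theta)\Delta\mathbf{x}\|^2 \right\} \right\}. \tag{16}$$

We note that for $M$-QAM, $E_{\min}(\mathbf{P}(\psi^*, \theta^*)) = 2\rho^2\, \delta(\gamma, \mathcal{A}_{M\text{-QAM}}) = 2\rho^2\, \delta(\gamma, \mathcal{A}_{\sqrt{M}\text{-PAM}})$. Hence, we only need to search for the $\theta^*$ and $\psi^*$ for which $\delta(\gamma, \mathcal{A}_{\sqrt{M}\text{-PAM}})$ is obtained. Note that this simplification of the search to only a $\sqrt{M}$-PAM is possible since $\mathbf{F}(\gamma, \psi, \theta)$ is real-valued. This is another huge advantage over the complex-valued precoder, which does not enjoy this benefit. Henceforth, $\theta^*$ and $\psi^*$ are used to denote the approximately optimal angles of $\theta$ and $\psi$. Due to our choice of the increment size, one can safely say that $\left(E_{\min}(\mathbf{P}_{opt}) - E_{\min}(\mathbf{P}(\psi^*, \theta^*))\right) < \kappa \cdot E_{\min}(\mathbf{P}_{opt})$, where $\kappa$ is a very small fraction of the order of $10^{-3}$.

The search results reveal that $\theta^*$ as a function of $\gamma$ can be written as

$$\theta^* = \sum_{k=1}^{n} \theta_k^* \left( u(\gamma - \gamma_k') - u(\gamma - \gamma_k' - w_k) \right), \tag{17}$$

where $\theta_k^*$, $k = 1, \cdots, n$, are constants, $n$ is the finite number of different values $\theta^*$ takes, $\gamma_k'$ is the value of $\gamma$ at which $\theta^*$ changes from $\theta_{k-1}^*$ to $\theta_k^*$, with $\gamma_1' = 0$, $\theta_0^* = 0$, $w_k = \gamma_{k+1}' - \gamma_k'$ and $\gamma_{n+1}' = \pi/4$. The search results also reveal that $\psi^*$ cannot be expressed as a weighted sum of shifted step functions and hence a closed form expression needs to be obtained analytically. To obtain this, we first obtain $\theta^*$ as follows.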
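The numerical search over $(\psi, \theta)$ can be sketched in a few lines. The code below is a coarse-grid illustration only, under these assumptions: $\mathbf{F}(\gamma, \psi, \theta)$ is the product of $\mathrm{diag}(\cos\gamma, \sin\gamma)$, $\mathrm{diag}(\cos\psi, \sin\psi)$ and the rotation with rows $(\cos\theta, -\sin\theta)$ and $(\sin\theta, \cos\theta)$; difference vectors are $2[p\ q]^T$ with integer $p, q$ between $-(\sqrt{M}-1)$ and $\sqrt{M}-1$; and the grid includes the angle 0. The function names and the default step of 0.01 rad (coarser than the 0.001 rad used in the paper) are choices of this sketch.

```python
import math


def f_norm_sq(gamma, psi, theta, p, q):
    """Squared norm of F(gamma, psi, theta) applied to [p, q]^T."""
    first = math.cos(gamma) * math.cos(psi) * (p * math.cos(theta) - q * math.sin(theta))
    second = math.sin(gamma) * math.sin(psi) * (q * math.cos(theta) + p * math.sin(theta))
    return first ** 2 + second ** 2


def min_distance(gamma, psi, theta, root_m):
    """Minimum of ||F * 2[p, q]^T||^2 over all non-zero PAM difference pairs."""
    diffs = range(-(root_m - 1), root_m)
    return min(4.0 * f_norm_sq(gamma, psi, theta, p, q)
               for p in diffs for q in diffs if (p, q) != (0, 0))


def grid_search(gamma, root_m, step=0.01):
    """Return (best distance, psi, theta) on a uniform grid of the two angles."""
    best = (-1.0, 0.0, 0.0)
    n_psi = int((math.pi / 2) / step)
    n_theta = int((math.pi / 4) / step)
    for i in range(n_psi + 1):
        for k in range(n_theta + 1):
            psi, theta = i * step, k * step
            d = min_distance(gamma, psi, theta, root_m)
            if d > best[0]:
                best = (d, psi, theta)
    return best
```

Because the search only ranges over the real $\sqrt{M}$-PAM differences, the inner minimisation touches $(2\sqrt{M}-1)^2 - 1$ pairs rather than all QAM difference vectors. For a small channel angle such as $\gamma = 0.05$ and 4-QAM (`root_m = 2`), the grid maximum lands at $\psi = 0$ with $\theta$ close to $\tan^{-1}(1/2)$.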



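A. Calculating $\theta^*$

For $M$-QAM, in order to obtain $\theta^*$ and $\psi^*$, as given by (16), the entries of $\Delta\mathbf{x}$ take values from $\{2(-\sqrt{M} + i),\ i = 1, 2, \cdots, 2\sqrt{M} - 1\}$. Let $p, q \in \{-\sqrt{M} + i,\ i = 1, 2, \cdots, 2\sqrt{M} - 1\}$ be such that

$$4\|\mathbf{F}(\gamma, \psi^*, \theta^*)\,[p \ \ q]^T\|^2 = \delta(\gamma, \mathcal{A}_{M\text{-QAM}}) = \delta(\gamma, \mathcal{A}_{\sqrt{M}\text{-PAM}}). \tag{18}$$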
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The numerical searches done for 5 QAM constellations - 4-/16764-/256-/1024-QAM reveal that 

1) there are two distinct (p,q) pairs for which ( fT8l ) is satisfied when < 7 < 7 2 , where 7 2 is as 
defined in (Q/7]). These are (0, 1) and (1, y/M - 1). Also ijj* = in this range of 7. 

2) There are three distinct (p, q) pairs for which (TT~8T > is satisfied when 7^ < 7 < 7^, +1 , k = 2, • • • , n. 
Let 

e(p, g, 0* , V*) = c °s 2 7 c °s 2 (^* ) cos (0* ) - g sin (6* ) ) 2 + sin 2 7 sin 2 (V>* ) (q cos (0* ) + p sin (0* ) ) 2 . 
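So, for $0 < \gamma < \gamma'_2$, we have

$$e(0, 1, \theta_1^*, 0) = e(1, \sqrt{M} - 1, \theta_1^*, 0),$$

solving which we obtain $\theta_1^* = \tan^{-1}\left(\frac{1}{\sqrt{M}}\right)$. The other solution, which is $\theta_1^* = \tan^{-1}\left(\frac{1}{\sqrt{M} - 2}\right)$, is ruled out since it has been observed that $E_{\min}\left(P\left(0, \tan^{-1}(1/\sqrt{M})\right)\right) > E_{\min}\left(P\left(0, \tan^{-1}(1/(\sqrt{M} - 2))\right)\right)$ for $0 < \gamma < \gamma'_2$. For $\gamma'_k < \gamma < \gamma'_{k+1}$, $k = 2, \cdots, n$, we have

$$e(p_1, q_1, \theta_k^*, \psi^*) = e(p_2, q_2, \theta_k^*, \psi^*) = e(p_3, q_3, \theta_k^*, \psi^*),$$

where $(p_1, q_1)$, $(p_2, q_2)$ and $(p_3, q_3)$ are the three pairs for which (18) is satisfied. Solving them, we arrive at

$$\tan^2\gamma \tan^2(\psi^*) = 1 + \frac{p_1^2 + q_1^2 - p_2^2 - q_2^2}{(p_2 q_2 - p_1 q_1)\sin(2\theta_k^*) + (q_2^2 - q_1^2)\cos^2(\theta_k^*) + (p_2^2 - p_1^2)\sin^2(\theta_k^*)}, \tag{19}$$

$$\tan^2\gamma \tan^2(\psi^*) = 1 + \frac{p_1^2 + q_1^2 - p_3^2 - q_3^2}{(p_3 q_3 - p_1 q_1)\sin(2\theta_k^*) + (q_3^2 - q_1^2)\cos^2(\theta_k^*) + (p_3^2 - p_1^2)\sin^2(\theta_k^*)}. \tag{20}$$

Equating (19) and (20), we obtain

$$(a_1 d_2 - a_2 d_1)\tan^2(\theta_k^*) + 2(a_1 b_2 - a_2 b_1)\tan(\theta_k^*) + a_1 c_2 - a_2 c_1 = 0, \tag{21}$$

where $a_1 = p_1^2 + q_1^2 - p_2^2 - q_2^2$, $b_1 = p_2 q_2 - p_1 q_1$, $c_1 = q_2^2 - q_1^2$, $d_1 = p_2^2 - p_1^2$, $a_2 = p_1^2 + q_1^2 - p_3^2 - q_3^2$, $b_2 = p_3 q_3 - p_1 q_1$, $c_2 = q_3^2 - q_1^2$ and $d_2 = p_3^2 - p_1^2$. Equation (21) has been observed to have only one solution in the range $(0, \pi/4)$. This solution gives $\theta_k^*$.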
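As a rough illustration of how (21) can be solved in practice, the sketch below builds the quadratic coefficients from three $(p, q)$ pairs, solves for $\tan(\theta_k^*)$, and evaluates the right-hand side of (19) at the resulting angle. The function names and the three pairs used in the usage note are assumptions chosen for illustration (they are not read from Table I), and the root filter admits the endpoint $\pi/4$ with a small tolerance, which is also an assumption.

```python
import math


def solve_theta(pairs, eps=1e-12):
    """Return the angles in (0, pi/4] whose tangent solves the quadratic (21)."""
    (p1, q1), (p2, q2), (p3, q3) = pairs
    a1 = p1**2 + q1**2 - p2**2 - q2**2
    b1 = p2 * q2 - p1 * q1
    c1 = q2**2 - q1**2
    d1 = p2**2 - p1**2
    a2 = p1**2 + q1**2 - p3**2 - q3**2
    b2 = p3 * q3 - p1 * q1
    c2 = q3**2 - q1**2
    d2 = p3**2 - p1**2
    qa = a1 * d2 - a2 * d1          # coefficient of tan^2
    qb = 2 * (a1 * b2 - a2 * b1)    # coefficient of tan
    qc = a1 * c2 - a2 * c1          # constant term
    if qa == 0:
        roots = [-qc / qb] if qb != 0 else []
    else:
        disc = qb * qb - 4 * qa * qc
        roots = [] if disc < 0 else [
            (-qb + s * math.sqrt(disc)) / (2 * qa) for s in (1.0, -1.0)
        ]
    return sorted({math.atan(t) for t in roots if 0 < t <= 1 + eps})


def rhs_19(pairs, theta):
    """Right-hand side of (19), i.e. the constant tan^2(gamma) tan^2(psi*)."""
    (p1, q1), (p2, q2) = pairs[0], pairs[1]
    num = p1**2 + q1**2 - p2**2 - q2**2
    den = ((p2 * q2 - p1 * q1) * math.sin(2 * theta)
           + (q2**2 - q1**2) * math.cos(theta) ** 2
           + (p2**2 - p1**2) * math.sin(theta) ** 2)
    return 1 + num / den


# Illustrative pairs (an assumption, checked only by direct computation).
PAIRS = [(0, 1), (1, 1), (1, 0)]
```

For these illustrative pairs, `solve_theta(PAIRS)` yields a single angle equal to $\pi/4$ and `rhs_19(PAIRS, math.pi / 4)` evaluates to $1/3$, which agrees with the values $\theta_2 = \pi/4$ and $\psi_2 = \tan^{-1}\left(1/(\sqrt{3}\tan\gamma_2)\right)$ quoted for 4-QAM in the example of Section V.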

B. Calculating $\psi^*$

As mentioned before, $\psi^* = 0$ for $0 < \gamma < \gamma'_2$. In order to obtain $\psi^*$ for $\gamma'_k < \gamma < \gamma'_{k+1}$, $k = 2, \cdots, n$, we note from (19) and (20) that $\tan^2\gamma \tan^2(\psi^*)$ is constant in that range of $\gamma$ and hence,

$$\psi^* = \tan^{-1}\left(\frac{\sqrt{A_k}}{\tan\gamma}\right), \tag{22}$$

where $A_k$ is given by the R.H.S of (19) (or (20)).

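C. Calculating $\gamma'_k$

Having obtained $\theta_k^*$ and $\psi^*$, we proceed to find the exact values of $\gamma'_k$, $k = 2, \cdots, n$, as follows. For convenience, let $\psi^*(\theta_k^*, \gamma) = \psi^*$ (given by (22)) for $\gamma'_k < \gamma < \gamma'_{k+1}$, $k = 1, \cdots, n$. Since $\theta^*$ is discontinuous at $\gamma'_k$, where it makes a transition from $\theta_{k-1}^*$ to $\theta_k^*$, $k \geq 2$, we have, at $\gamma = \gamma'_k$,

$$e\left(p_{k-1}, q_{k-1}, \theta_{k-1}^*, \psi^*(\theta_{k-1}^*, \gamma'_k)\right) = e\left(p_k, q_k, \theta_k^*, \psi^*(\theta_k^*, \gamma'_k)\right),$$

where the pairs $(p_{k-1}, q_{k-1})$ and $(p_k, q_k)$ satisfy (18) for $\gamma'_{k-1} < \gamma < \gamma'_k$ and $\gamma'_k < \gamma < \gamma'_{k+1}$, respectively. So, we have

$$\cos^2\left(\psi^*(\theta_{k-1}^*, \gamma'_k)\right) = \cos^2\left(\psi^*(\theta_k^*, \gamma'_k)\right) \left(\frac{c + A_k d}{a + A_{k-1} b}\right), \tag{23}$$

$$\sin^2\left(\psi^*(\theta_{k-1}^*, \gamma'_k)\right) = \sin^2\left(\psi^*(\theta_k^*, \gamma'_k)\right) \left(\frac{A_{k-1}}{A_k}\right) \left(\frac{c + A_k d}{a + A_{k-1} b}\right), \tag{24}$$

where $a = \left(p_{k-1}\cos(\theta_{k-1}^*) - q_{k-1}\sin(\theta_{k-1}^*)\right)^2$, $b = \left(q_{k-1}\cos(\theta_{k-1}^*) + p_{k-1}\sin(\theta_{k-1}^*)\right)^2$, $c = \left(p_k\cos(\theta_k^*) - q_k\sin(\theta_k^*)\right)^2$, $d = \left(q_k\cos(\theta_k^*) + p_k\sin(\theta_k^*)\right)^2$, and, as explained in Subsection IV-B, $A_{k-1}$ and $A_k$ are constants given by $A_{k-1} = \tan^2\gamma \tan^2\left(\psi^*(\theta_{k-1}^*, \gamma)\right)$ for $\gamma'_{k-1} < \gamma < \gamma'_k$ and $A_k = \tan^2\gamma \tan^2\left(\psi^*(\theta_k^*, \gamma)\right)$ for $\gamma'_k < \gamma < \gamma'_{k+1}$. Solving (23) and (24), we obtain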



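$$\sin^2\left(\psi^*(\theta_k^*, \gamma'_k)\right) = \frac{A_k\left(c + A_k d - a - A_{k-1} b\right)}{\left(A_k - A_{k-1}\right)\left(c + A_k d\right)}. \tag{25}$$

Using (22) and (25),

$$\gamma'_k = \tan^{-1}\left(\frac{\sqrt{A_k}}{\tan\left(\psi^*(\theta_k^*, \gamma'_k)\right)}\right).$$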
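The value of $\delta(\gamma, \mathcal{A}_{M\text{-QAM}})$ as defined in (15) for $\gamma'_k < \gamma < \gamma'_{k+1}$ is given by

$$\delta(\gamma, \mathcal{A}_{M\text{-QAM}}) = \sin^2\gamma \left(\frac{c + A_k d}{A_k + \tan^2\gamma}\right), \tag{26}$$

where $c = \left(p_k\cos(\theta_k^*) - q_k\sin(\theta_k^*)\right)^2$, $d = \left(q_k\cos(\theta_k^*) + p_k\sin(\theta_k^*)\right)^2$, with $(p_k, q_k)$ any of the $(p, q)$ pairs satisfying (18).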
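The relationship between $e(p, q, \theta^*, \psi^*)$, the angle $\psi^*$ of (22) and the closed form $\sin^2\gamma\,(c + A_k d)/(A_k + \tan^2\gamma)$ can be checked numerically, and a breakpoint $\gamma'_k$ can be located as the point where two consecutive branches of that closed form meet. The sketch below does both under stated assumptions: the function names, the bisection-based crossing search (used here in place of the closed-form solution), and the 4-QAM-style inputs in the usage note are illustrative choices, not values read from Table I.

```python
import math


def _cd(p, q, theta):
    """The quantities c and d defined below (26) for a pair (p, q)."""
    c = (p * math.cos(theta) - q * math.sin(theta)) ** 2
    d = (q * math.cos(theta) + p * math.sin(theta)) ** 2
    return c, d


def e_metric(p, q, theta, psi, gamma):
    """e(p, q, theta, psi) as defined in Subsection IV-A."""
    c, d = _cd(p, q, theta)
    return (math.cos(gamma) ** 2 * math.cos(psi) ** 2 * c
            + math.sin(gamma) ** 2 * math.sin(psi) ** 2 * d)


def psi_star(A, gamma):
    """Angle psi* from (22) for a constant A = tan^2(gamma) tan^2(psi*)."""
    return math.atan(math.sqrt(A) / math.tan(gamma))


def closed_form(p, q, theta, A, gamma):
    """sin^2(gamma) (c + A d) / (A + tan^2(gamma)); requires gamma > 0."""
    c, d = _cd(p, q, theta)
    return math.sin(gamma) ** 2 * (c + A * d) / (A + math.tan(gamma) ** 2)


def crossing(f_low, f_high, lo=1e-6, hi=math.pi / 4, iters=200):
    """Bisection for the gamma where two branches meet (assumes one sign change)."""
    def g(x):
        return f_low(x) - f_high(x)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Illustrative branches: pair (0, 1) with theta = atan(1/2), A = 0 (psi* = 0),
# and pair (1, 1) with theta = pi/4, A = 1/3.
def branch_1(g):
    return closed_form(0, 1, math.atan(0.5), 0.0, g)


def branch_2(g):
    return closed_form(1, 1, math.pi / 4, 1.0 / 3.0, g)
```

For these illustrative inputs the two branches meet where $\tan^2\gamma = 1/7$, so `crossing(branch_1, branch_2)` returns approximately $\tan^{-1}(1/\sqrt{7})$. A closed-form route via (23) to (25) avoids the search entirely; the bisection is only a convenient cross-check.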

Table I presents the values of $\theta^*$ for different values of $\gamma$ for 4-QAM, 16-QAM, 64-QAM, 256-QAM and 1024-QAM. The value of the constants $\tan\gamma \tan(\psi^*)$ and the corresponding pairs $(p, q)$ for which (18) is satisfied are also tabulated. Except for the case of 4-QAM, the values presented in Table I are the approximately optimal values rounded off to the fourth decimal place. This has been done since it is very cumbersome to express them in exact form. All angles are expressed in radians. Noting the values of $\theta^*$ for 4-QAM, it is natural to believe that the angles tabulated are optimal for 4-QAM. Also, it can be noted that for every subsequent larger constellation, $\theta^*$ differs from its corresponding values for the lower-sized constellation only at low values of $\gamma$, which means that the numerical search need not be done over the entire range of $\gamma$ as the size of the constellation increases. The plots of $\delta(\gamma, \mathcal{A}_{M\text{-QAM}})$ as a function of $\gamma$ for the different unnormalized QAM constellations are given in Fig. 1. The curves for 256- and 1024-QAM appear to coincide, since they differ only at extremely low values of $\gamma$. In Fig. 2, the plots$^4$ of $\delta(\gamma, \mathcal{A})$ for the E-$d_{\min}$ precoder, the proposed precoder, the X-precoder and the Y-precoder are given for $M = 4$ with the same power constraint for all the precoders as for our precoder. As expected, the E-$d_{\min}$ precoder has the best values of $\delta(\gamma, \mathcal{A})$ over the entire range of $\gamma$, while our precoder has better values of $\delta(\gamma, \mathcal{A})$ than the X- and Y-precoders. Fig. 3 and Fig. 4 show the plots of $\delta(\gamma, \mathcal{A})$ for our precoder, the X-precoder and the Y-precoder for $M = 16$ and $M = 64$, respectively. For the X-precoder, the plots were obtained using numerical searches to obtain the approximately optimal angle for each value of $\gamma$ in the range $(0, \pi/4)$, with $\gamma$ increasing in steps of 0.001. Note that for low values of $\gamma$, our precoder and the Y-precoder have identical $\delta(\gamma, \mathcal{A})$, which is because both transmission schemes are effectively the same in this range of $\gamma$. With an increase in the constellation size, the Y-precoder has increasingly lower values of $\delta(\gamma, \mathcal{A})$ than those of our precoder and the X-precoder at higher values of $\gamma$. It is also clear from the plots that the Y-precoder is expected to have better error performance than the X-precoder only for ill-conditioned channels, i.e., for low values of $\gamma$.

D. ML-decoding complexity 

We make use of the following lemma to analyze the ML-decoding complexity of our precoder. 

Lemma 1: For symbols $x_1$ and $x_2$ taking values from $\mathcal{A}_{M\text{-QAM}}$, the symbol $ax_1 + bx_2$ takes values from $\mathcal{A}_{M^2\text{-QAM}}$ if $a = \sqrt{M}$, $b = 1$ or $b = \sqrt{M}$, $a = 1$.

Proof: Firstly, $\mathcal{A}_{M\text{-QAM}}$ represents the standard, unnormalized $M$-QAM constellation, as given in (14). Let $\sqrt{M}\mathcal{A}_{M\text{-QAM}}$ denote the $M$-QAM constellation scaled by $\sqrt{M}$. So, the distance between any two adjacent signal points on the same vertical or horizontal line of $\sqrt{M}\mathcal{A}_{M\text{-QAM}}$ is $2\sqrt{M}$. Now, the constellation given by

$$\mathcal{A}' = \left\{\sqrt{M}x_1 + x_2 \mid x_1, x_2 \in \mathcal{A}_{M\text{-QAM}}\right\} \tag{27}$$

can be viewed as being obtained by replacing every element of $\sqrt{M}\mathcal{A}_{M\text{-QAM}}$ by the entire constellation $\mathcal{A}_{M\text{-QAM}}$ such that the origin of $\mathcal{A}_{M\text{-QAM}}$ is the signal point being replaced. Hence, $\mathcal{A}'$ has $M^2$ signal points and a QAM structure, and the distance between adjacent points on the same vertical or horizontal line is 2. Therefore, $\mathcal{A}'$ is an $M^2$-QAM. ■

$^4$ In all the plots, the E-$d_{\min}$ precoder, our precoder and the X-precoder use $M$-QAM, while the Y-precoder uses a two-dimensional codebook of size $M$, as defined in [16].
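Lemma 1 can also be confirmed by brute-force enumeration for small constellations. The sketch below is only an illustration: the helper names `qam` and `combined` are assumptions, and the unnormalized square $M$-QAM is taken to have odd-integer coordinates from $-\sqrt{M} + 1$ to $\sqrt{M} - 1$ in each dimension.

```python
import math


def qam(M):
    """Unnormalized square M-QAM: odd-integer coordinates in each dimension."""
    r = math.isqrt(M)
    if r * r != M:
        raise ValueError("M must be a perfect square")
    levels = range(-r + 1, r, 2)
    return {complex(i, q) for i in levels for q in levels}


def combined(M):
    """The set {sqrt(M) x1 + x2 : x1, x2 in M-QAM} considered in the lemma."""
    r = math.isqrt(M)
    base = qam(M)
    return {r * x1 + x2 for x1 in base for x2 in base}
```

Comparing `combined(M)` with `qam(M * M)` for $M = 4$ and $M = 16$ shows that the two sets coincide, each containing $M^2$ points.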

The following theorem gives the ML-decoding complexity of the precoder. 

Theorem 4: For the proposed precoder, the following claims hold.

1) The ML-decoding complexity is $O(\sqrt{M})$, when $\gamma'_k < \gamma < \gamma'_{k+1}$, $k = 2, 3, \cdots, n$.

2) The ML-decoding complexity is the same as that of a real scalar channel when $0 < \gamma < \gamma'_2$, with no exhaustive search over all the signal points required.

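Proof: These claims are proved below.

Case 1: $\gamma'_k < \gamma < \gamma'_{k+1}$, $k = 2, 3, \cdots, n$.

In this case, the decoded signal vector $\hat{\mathbf{x}}$ is

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x} \in \mathcal{A}_{M\text{-QAM}}^{2 \times 1}} \left\| \mathbf{y}' - \sqrt{\frac{SNR}{2E_M}}\, \mathbf{D}\mathbf{P}\mathbf{x} \right\|^2 = \arg\min_{\mathbf{x} \in \mathcal{A}_{M\text{-QAM}}^{2 \times 1}} \left\| \mathbf{y}'' - \sqrt{\frac{SNR}{2E_M}}\, \mathbf{R}\mathbf{x} \right\|^2, \tag{28}$$

where $\mathbf{y}'$, $\mathbf{D}$ and $\mathbf{P}$ are as defined earlier, $E_M = 2(M - 1)/3$ is the average energy of an $M$-QAM and $\mathbf{y}'' = \mathbf{Q}^T\mathbf{y}'$, with $\mathbf{Q}$ and $\mathbf{R}$ obtained on the QR-decomposition of $\mathbf{DP}$. Since $\mathbf{D}$ and $\mathbf{P}$ are real-valued, (28) can be written as $\hat{\mathbf{x}} = \hat{\mathbf{x}}_I + j\hat{\mathbf{x}}_Q$, where

$$\hat{\mathbf{x}}_I = \arg\min_{\mathbf{x}_I \in \mathcal{A}_{\sqrt{M}\text{-PAM}}^{2 \times 1}} \left\| \mathbf{y}''_I - \sqrt{\frac{SNR}{2E_M}}\, \mathbf{R}\mathbf{x}_I \right\|^2, \qquad \hat{\mathbf{x}}_Q = \arg\min_{\mathbf{x}_Q \in \mathcal{A}_{\sqrt{M}\text{-PAM}}^{2 \times 1}} \left\| \mathbf{y}''_Q - \sqrt{\frac{SNR}{2E_M}}\, \mathbf{R}\mathbf{x}_Q \right\|^2,$$

with

$$\mathbf{y}'' = \mathbf{y}''_I + j\mathbf{y}''_Q = \left[y''_{1I} + jy''_{1Q},\ y''_{2I} + jy''_{2Q}\right]^T, \qquad \mathbf{x} = \mathbf{x}_I + j\mathbf{x}_Q = \left[x_{1I} + jx_{1Q},\ x_{2I} + jx_{2Q}\right]^T.$$

To obtain $\hat{\mathbf{x}}_I$, instead of using a 2-dimensional real sphere decoder, we do the following. For each possible value of $x_{2I} \in \mathcal{A}_{\sqrt{M}\text{-PAM}}$, the corresponding value of $x_{1I}$ is evaluated as

$$\hat{x}_{1I} = \min\left(\max\left(2\,\mathrm{rnd}\left[\frac{u + 1}{2}\right] - 1,\ -\sqrt{M} + 1\right),\ \sqrt{M} - 1\right), \tag{29}$$

where $\mathrm{rnd}[\cdot]$ denotes rounding to the nearest integer and

$$u = \frac{\sqrt{\frac{2E_M}{SNR}}\, y''_{1I} - R(1,2)\, x_{2I}}{R(1,1)},$$

and $\hat{\mathbf{x}}_I$ is given by that $(\hat{x}_{1I}, x_{2I})$ pair that minimizes

$$\left\| \mathbf{y}''_I - \sqrt{\frac{SNR}{2E_M}}\, \mathbf{R}\mathbf{x}_I \right\|^2.$$

So, there are only $\sqrt{M}$ searches (for the $\sqrt{M}$ possibilities for $x_{2I}$) involved in minimizing the ML-metric. The operation shown on the R.H.S of (29) quantizes $x_{1I}$ to its nearest possible value for a fixed $x_{2I}$. This is made possible by the structure of $M$-QAM, which is a Cartesian product of two $\sqrt{M}$-PAM constellations. The same method can be applied to obtain $\hat{\mathbf{x}}_Q$. So, the ML-decoding complexity is $O(\sqrt{M})$.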
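The search described in Case 1 can be sketched for one real component as follows. This is a minimal illustration rather than the authors' implementation: the function names are assumptions, the upper-triangular matrix `R` is passed in directly with the scale factor $\sqrt{SNR/(2E_M)}$ assumed to be already absorbed into it, and a brute-force search is included only for comparison.

```python
import itertools
import math
import random


def pam(M):
    """sqrt(M)-PAM levels: odd integers from -sqrt(M)+1 to sqrt(M)-1."""
    r = math.isqrt(M)
    return list(range(-r + 1, r, 2))


def quantize(u, M):
    """Nearest odd integer to u, clipped to the PAM range, as in (29)."""
    r = math.isqrt(M)
    nearest_odd = 2 * round((u + 1) / 2) - 1
    return min(max(nearest_odd, -r + 1), r - 1)


def metric(y, R, x1, x2):
    """Squared distance ||y - R x||^2 for upper-triangular 2x2 R."""
    return ((y[0] - R[0][0] * x1 - R[0][1] * x2) ** 2
            + (y[1] - R[1][1] * x2) ** 2)


def decode_fast(y, R, M):
    """One quantization per candidate x2: sqrt(M) metric evaluations."""
    best = None
    for x2 in pam(M):
        u = (y[0] - R[0][1] * x2) / R[0][0]
        x1 = quantize(u, M)
        m = metric(y, R, x1, x2)
        if best is None or m < best[0]:
            best = (m, x1, x2)
    return best


def decode_brute(y, R, M):
    """Exhaustive search over all M candidate pairs, for comparison only."""
    return min((metric(y, R, x1, x2), x1, x2)
               for x1, x2 in itertools.product(pam(M), repeat=2))
```

For a fixed `x2` the second row of the metric is constant, so the best `x1` is the PAM point nearest to `u`; this is why the reduced search attains the same minimum metric as the exhaustive one. Applying the same routine to the imaginary parts gives the other component.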

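Case 2: $0 < \gamma < \gamma'_2$.

From Table I and also as was pointed out earlier, for $0 < \gamma < \gamma'_2$, $\psi^* = 0$ and $\theta^* = \tan^{-1}\left(\frac{1}{\sqrt{M}}\right)$. This means that transmission is made only on the first virtual subchannel and the received signal of interest, with regard to the system model defined earlier, can be expressed as

$$y'_1 = ax' + n'_1,$$

where $n'_1$ is the first element of $\mathbf{n}'$, $a = \sqrt{\sigma_1^2\, SNR/((M + 1)E_M)}$ and $x' = \sqrt{M}x_1 + x_2$, where $x_1$ and $x_2$ take values from $\mathcal{A}_{M\text{-QAM}}$.$^5$ From Lemma 1, $x'$ takes values from $\mathcal{A}_{M^2\text{-QAM}}$. So, in the first step, $x'$ is decoded to obtain $\hat{x}' = \hat{x}'_I + j\hat{x}'_Q$ by quantizing, where $\hat{x}'_I$ and $\hat{x}'_Q$ are given by

$$\hat{x}'_I = \min\left(\max\left(2\,\mathrm{rnd}\left[\frac{y'_{1I}/a + 1}{2}\right] - 1,\ -M + 1\right),\ M - 1\right),$$

$$\hat{x}'_Q = \min\left(\max\left(2\,\mathrm{rnd}\left[\frac{y'_{1Q}/a + 1}{2}\right] - 1,\ -M + 1\right),\ M - 1\right),$$

with $y'_{1I}$ and $y'_{1Q}$ denoting the real and imaginary parts of $y'_1$. From $\hat{x}'$, $x_1$ is decoded to obtain $\hat{x}_1 = \hat{x}_{1I} + j\hat{x}_{1Q}$, with $\hat{x}_{1I}$ and $\hat{x}_{1Q}$ given by

$$\hat{x}_{1I} = \mathrm{sgn}(\hat{x}'_I)\left(2\left\lceil \frac{|\hat{x}'_I|}{2\sqrt{M}} \right\rceil - 1\right), \qquad \hat{x}_{1Q} = \mathrm{sgn}(\hat{x}'_Q)\left(2\left\lceil \frac{|\hat{x}'_Q|}{2\sqrt{M}} \right\rceil - 1\right), \tag{30}$$

and $x_2$ is decoded to obtain $\hat{x}_2 = \hat{x}_{2I} + j\hat{x}_{2Q}$, with $\hat{x}_{2I}$ and $\hat{x}_{2Q}$ given by

$$\hat{x}_{2I} = \hat{x}'_I - \sqrt{M}\hat{x}_{1I}, \qquad \hat{x}_{2Q} = \hat{x}'_Q - \sqrt{M}\hat{x}_{1Q}. \tag{31}$$

Note that the operations shown in (30) and (31) together perform the inverse of the function given by

$$f(x_{1I}, x_{1Q}, x_{2I}, x_{2Q}) = \sqrt{M}(x_{1I} + jx_{1Q}) + (x_{2I} + jx_{2Q})$$

for $x_{1I}, x_{1Q}, x_{2I}, x_{2Q} \in \mathcal{A}_{\sqrt{M}\text{-PAM}}$. Therefore, decoding $x_1$ and $x_2$ requires no exhaustive search over the $M$ signal points of the constellation. ■

$^5$ From a bit error rate point of view, it is advisable to transmit a symbol $x_1$ alone on the first virtual subchannel, with $x_1$ taking values from a Gray coded $M^2$-QAM. This is because the constellation given by (27) will not be Gray coded. However, with a view to minimizing the word error rate, transmission of $x = \sqrt{M}x_1 + x_2$, with $x_1$ and $x_2$ taking values from $M$-QAM, is as good a strategy as transmitting $x_1$ alone, with $x_1$ taking values from a Gray coded $M^2$-QAM.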
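The search-free decoding of Case 2 can be sketched as a quantization step followed by the splitting operations of (30) and (31). The code below is an illustration under stated assumptions: the function names are hypothetical, the channel gain `a` is taken to be known and positive, and the quantizer input is assumed to be the received sample divided by `a`.

```python
import math


def quantize_pam(u, size):
    """Nearest odd integer to u, clipped to {-size+1, ..., size-1}."""
    return min(max(2 * round((u + 1) / 2) - 1, -size + 1), size - 1)


def split(xp, M):
    """Invert xp = sqrt(M) * x1 + x2 for odd xp, following (30) and (31)."""
    r = math.isqrt(M)
    sign = 1 if xp > 0 else -1
    x1 = sign * (2 * math.ceil(abs(xp) / (2 * r)) - 1)
    x2 = xp - r * x1
    return x1, x2


def decode_case2(y1, a, M):
    """Decode (x1, x2) from y1 = a (sqrt(M) x1 + x2) + noise without a search."""
    xi = quantize_pam(y1.real / a, M)   # real part of x' lies in M-PAM
    xq = quantize_pam(y1.imag / a, M)   # imaginary part of x' lies in M-PAM
    x1i, x2i = split(xi, M)
    x1q, x2q = split(xq, M)
    return complex(x1i, x1q), complex(x2i, x2q)
```

The cost is a fixed number of arithmetic operations per received sample, independent of $M$, which is the sense in which the complexity matches that of a real scalar channel.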
It has to be pointed out that the advantage of not having to search over any of the signal points when $0 < \gamma < \gamma'_2$ is unique to the proposed real-valued precoder and is not obtainable for the complex-valued optimal precoder [15] for 4-QAM, for which the effective constellation when $0 < \gamma < 0.3016$ appears like a $\pi/12$-rotated QAM constellation. It is not exactly a rotated QAM constellation, however. Hence, when $0 < \gamma < 0.3016$, even the sphere decoder cannot be used, since the effective constellation is not a lattice.

V. Extension for $n_t > 2$

For the case of two transmit antennas, it is possible to obtain SVD-based, approximately optimal precoders (complex-valued precoder for 4-QAM, real-valued precoder for any $M$-QAM). Such precoders are defined by two or three parameters, depending on whether the precoder is real-valued or complex-valued, respectively. However, such an approach cannot be taken for the case of $n_t > 2$, since, even for $n_t = 3$, an optimal precoder would be defined by as many as 5 parameters, ruling out the possibility of a computer search even for 4-QAM. So, a more practical way of obtaining a precoder with a reasonable error performance is to pair the $i^{th}$ and the $(n_{\min} - i + 1)^{th}$ subchannels along with the $i^{th}$ and the $(n_{\min} - i + 1)^{th}$ symbols, $i = 1, 2, \cdots, n_{\min}/2$, and use the precoding scheme for 2 transmit antennas for this pair. This method of pairing has been shown to be the best in [14] and has also been adopted in [16]. The precoder would then have an 'X' structure, as given earlier. For the $i^{th}$ subchannel pairing, $\gamma_i = \tan^{-1}\left(\sigma_{n_{\min} - i + 1}/\sigma_i\right)$, $\rho_i = \sqrt{\sigma_i^2 + \sigma_{n_{\min} - i + 1}^2}$ and

$$\delta(\gamma_i, \mathcal{A}_{M\text{-QAM}}) = \sin^2\gamma_i \left(\frac{c_{k_i} + A_{k_i} d_{k_i}}{A_{k_i} + \tan^2\gamma_i}\right), \tag{32}$$

where $c_{k_i}$, $d_{k_i}$ and $A_{k_i}$ are as defined in the previous section without the subscript $i$ (refer to (26)) and depend on $\gamma_i$ and $M$. Proceeding on the lines of the proof of Theorem 3, the instantaneous WEP $P_{e,D}$ is upper bounded as
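$$P_{e,D} \leq \left(M^{n_{\min}} - 1\right) Q\left(\sqrt{\frac{SNR\, d_{\min}^2}{2 n_t E_M}}\right), \tag{33}$$

where $E_M = 2(M - 1)/3$ and $d_{\min} = \min_{\Delta\mathbf{x},\, \Delta\mathbf{x} \neq \mathbf{0}_{n_{\min} \times 1}} \|\mathbf{D}\mathbf{P}\Delta\mathbf{x}\|$, with $\Delta\mathbf{x} \in \left\{\mathbf{x} - \mathbf{x}' \mid \mathbf{x}, \mathbf{x}' \in \mathcal{A}_{M\text{-QAM}}^{n_{\min} \times 1}\right\}$ and $\mathbf{P}$ being the precoder with the $i^{th}$ and $(n_{\min} - i + 1)^{th}$ subchannels paired using the proposed precoding scheme described in Section IV. So,

$$d_{\min}^2 = \min_i \left\{ \frac{2 n_t}{n_{\min}}\, \rho_i^2\, \delta(\gamma_i, \mathcal{A}_{M\text{-QAM}}) \right\},$$

with $\delta(\gamma_i, \mathcal{A}_{M\text{-QAM}})$ given by (32). Observe that the scaling factor of $2n_t/n_{\min}$ has been used to take into account the constraint that $\|\mathbf{P}\|^2 = n_t$. Since the values of $\delta(\gamma_i, \mathcal{A}_{M\text{-QAM}})$ are known, we can enhance the error performance of the precoder by pre-multiplying the precoding matrix with a power control matrix $\mathbf{T} = \mathrm{diag}(\tau_1, \tau_2, \cdots, \tau_{n_{\min}/2}, \tau_{n_{\min}/2}, \cdots, \tau_2, \tau_1)$ such that

$$\tau_i^2 \rho_i^2\, \delta(\gamma_i, \mathcal{A}_{M\text{-QAM}}) = \eta^2, \quad \forall i \in \left\{1, 2, \cdots, \frac{n_{\min}}{2}\right\}, \tag{34}$$

where $\eta$ is a constant and the power constraint on $\mathbf{T}$ is such that $\|\mathbf{T}\|^2 = 2\sum_{i=1}^{n_{\min}/2} \tau_i^2 = n_{\min}$. Due to this power constraint, from (34), we obtain

$$\tau_i = \sqrt{\frac{n_{\min}}{2\rho_i^2\, \delta(\gamma_i, \mathcal{A}_{M\text{-QAM}})} \left(\sum_{j=1}^{n_{\min}/2} \frac{1}{\rho_j^2\, \delta(\gamma_j, \mathcal{A}_{M\text{-QAM}})}\right)^{-1}},$$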



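where $\delta(\gamma_i, \mathcal{A}_{M\text{-QAM}})$ is obtainable from (32). Hence, the proposed precoder has the 'X' structure given earlier, where

$$\mathbf{M}_i = \tau_i \begin{bmatrix} \cos\psi_i \cos\theta_i & -\cos\psi_i \sin\theta_i \\ \sin\psi_i \sin\theta_i & \sin\psi_i \cos\theta_i \end{bmatrix},$$

with $\theta_i$ and $\psi_i$ being the approximately optimal values obtainable from (21) and (22), respectively, both depending on $\gamma_i$ and $M$. For example, for a $4 \times 4$ system using 4-QAM signalling, if, for some channel realization, $\gamma_1 = \tan^{-1}(\sigma_4/\sigma_1)$ and $\gamma_2 = \tan^{-1}(\sigma_3/\sigma_2)$ are such that $0 < \gamma_1 < \gamma'_2$ and $\gamma'_2 < \gamma_2 < \pi/4$, then, from Table I, $\theta_1 = \tan^{-1}(1/2)$, $\psi_1 = 0$, $\theta_2 = \pi/4$, $\psi_2 = \tan^{-1}\left(\frac{1}{\sqrt{3}\tan(\gamma_2)}\right)$,

$$\tau_1^2 = \frac{5\eta^2}{(\sigma_1^2 + \sigma_4^2)\cos^2\gamma_1}, \qquad \tau_2^2 = \frac{(1 + 3\tan^2\gamma_2)\,\eta^2}{2(\sigma_2^2 + \sigma_3^2)\sin^2\gamma_2},$$

and

$$\eta^2 = 2\left(\frac{5}{(\sigma_1^2 + \sigma_4^2)\cos^2\gamma_1} + \frac{1 + 3\tan^2\gamma_2}{2(\sigma_2^2 + \sigma_3^2)\sin^2\gamma_2}\right)^{-1}.$$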
The upper bound on the instantaneous WEP is now given as

$$P_{e,D} \leq \left(M^{n_{\min}} - 1\right) Q\left(\sqrt{\frac{SNR\, \eta^2}{n_{\min} E_M}}\right), \tag{35}$$

where

$$\eta = \sqrt{\frac{n_{\min}}{2} \left(\sum_{j=1}^{n_{\min}/2} \frac{1}{\rho_j^2\, \delta(\gamma_j, \mathcal{A}_{M\text{-QAM}})}\right)^{-1}}.$$

It can easily be checked that

$$\sqrt{\frac{SNR\, d_{\min}^2}{2 n_t E_M}} \leq \sqrt{\frac{SNR\, \eta^2}{n_{\min} E_M}},$$

and hence, the upper bound in (35) is lower than that in (33). Therefore, the use of the power control matrix enhances error performance. Note that the symbols of each subsystem can be decoded independently from the symbols of the other subsystems. Hence, the ML-decoding complexity offered by our precoding scheme is $O(\sqrt{M})$.
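The pairing of subchannels and the power-control weights can be sketched numerically as follows. This is an illustration under stated assumptions: the function names are hypothetical, the singular values are assumed sorted in decreasing order with an even count, and the $\delta$ values passed in are placeholders standing in for the values obtainable from (32).

```python
import math


def pair_channels(sigma):
    """Pair subchannel i with n_min - i + 1; return (gamma_i, rho_i) per pair."""
    n = len(sigma)
    pairs = []
    for i in range(n // 2):
        s_hi, s_lo = sigma[i], sigma[n - 1 - i]
        pairs.append((math.atan(s_lo / s_hi), math.hypot(s_hi, s_lo)))
    return pairs


def power_control(rho, delta):
    """Return (eta^2, [tau_i]) with tau_i^2 rho_i^2 delta_i = eta^2
    and 2 * sum(tau_i^2) = n_min."""
    n_min = 2 * len(rho)
    inv = [1.0 / (r * r * d) for r, d in zip(rho, delta)]
    eta_sq = n_min / (2.0 * sum(inv))
    tau = [math.sqrt(eta_sq * v) for v in inv]
    return eta_sq, tau


# Illustrative inputs: the delta values are placeholders, not values from (32).
SIGMA = [2.0, 1.5, 1.0, 0.5]
DELTA = [0.18, 0.30]
```

With these definitions $\eta^2$ is the harmonic mean of the products $\rho_i^2\,\delta_i$, so it can never fall below the smallest of them; this is the inequality behind the claim that the bound with power control is no worse than the bound without it.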

A similar approach of using a power control matrix has been taken in [14] for 4-QAM, but since we need to have explicit values of $\delta(\gamma_i, \mathcal{A}_{M\text{-QAM}})$, applying this scheme for the E-$d_{\min}$ precoder with larger constellations is not feasible. Structurally, the E-$d_{\min}$ precoder and the X-precoder differ from the proposed precoder in that for the E-$d_{\min}$ precoder, $\mathbf{M}_i$ is optimized using an additional parameter $\varphi_i$ (as shown in (13)), while for the X-precoder, $\mathbf{M}_i$ is optimized with $\tau_i = 1$ and $\psi_i = \pi/4$. Table II gives a comparison of the various low ML-decoding complexity precoding schemes.

VI. Diversity gain

The E-$d_{\min}$ precoder, the X-precoder and the Y-precoder have all been shown to guarantee a diversity gain equal to $(n_t - n_{\min}/2 + 1)(n_r - n_{\min}/2 + 1)$. Recall that the condition in Theorem 3 is only a sufficient condition for achieving full-diversity gain equal to $n_t n_r$. It is not necessary that $\mathbf{P}$ be such that $[\mathbf{P}\Delta\mathbf{x}]_1 \neq 0$ for $\Delta\mathbf{x} \in \left\{\mathbf{x} - \mathbf{x}' \mid \mathbf{x}, \mathbf{x}' \in \mathcal{A}^{n_{\min} \times 1}\right\}$. This can be seen by noting that for $n_t = 2$ and 4-QAM, our precoder does not satisfy the condition when $\theta^* = \pi/4$, but still gives full-diversity. This is proved in the following lemma.

Lemma 2: The proposed precoder offers full-diversity, i.e., a diversity gain equal to $2n_r$ for $n_t = 2$.

Proof: Consider the precoder given by

$$\begin{bmatrix} \cos(0.5\tan^{-1}2) & -\sin(0.5\tan^{-1}2) \\ \sin(0.5\tan^{-1}2) & \cos(0.5\tan^{-1}2) \end{bmatrix},$$

which is the full-diversity rotation matrix [22] in 2 dimensions and has the highest non-zero product distance among all $2 \times 2$ orthogonal matrices. This precoder, which we call the lattice precoder for $n_t = 2$, has full-diversity from Theorem 3. Clearly, $\delta(\gamma, \mathcal{A}_{M\text{-QAM}})$ for any value of $\gamma$ is greater for our precoder than that for the lattice precoder, since our precoder is approximately optimal among real-valued precoders. So, our precoder has better error performance than the lattice precoder. Hence, our precoder too offers full-diversity, like the lattice precoder for $n_t = 2$. ■
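The non-zero product distance of the rotation by $0.5\tan^{-1}2$ used in the proof can be checked numerically. The sketch below is an illustration only: the function name and the search range are assumptions, and integer difference vectors $(p, q)$ are used in place of the scaled QAM differences (which only rescales the result).

```python
import math

THETA = 0.5 * math.atan(2.0)  # rotation angle of the 2-D lattice precoder


def min_product_distance(max_abs):
    """Smallest |z1 * z2| over nonzero integer (p, q) with |p|, |q| <= max_abs,
    where (z1, z2) is (p, q) rotated by THETA."""
    c, s = math.cos(THETA), math.sin(THETA)
    best = math.inf
    for p in range(-max_abs, max_abs + 1):
        for q in range(-max_abs, max_abs + 1):
            if p == 0 and q == 0:
                continue
            z1 = p * c - q * s
            z2 = p * s + q * c
            best = min(best, abs(z1 * z2))
    return best
```

For this angle the product equals $(p^2 + pq - q^2)/\sqrt{5}$, which is never zero for nonzero integers, so the minimum over any search range is $1/\sqrt{5}$. In particular, neither rotated coordinate of a nonzero difference vector vanishes.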
From Lemma 2, Theorem 2 and Theorem 3, one would be inclined to believe that for $n_t > 2$, a subsystem with index $i$, for which the $i^{th}$ and the $(n_{\min} - i + 1)^{th}$ virtual subchannels are paired and the $i^{th}$ and the $(n_{\min} - i + 1)^{th}$ symbols precoded by the scheme proposed in Section IV, has a diversity gain of $(n_t - i + 1)(n_r - i + 1)$ with $i = 1, 2, \cdots, n_{\min}/2$, in which case the diversity gain of the whole system would be the minimum of the diversity gains of all the subsystems, i.e., $(n_t - n_{\min}/2 + 1)(n_r - n_{\min}/2 + 1)$. In fact, the diversity gains of systems using the E-$d_{\min}$ precoder and the X-precoder have been claimed to be $(n_t - n_{\min}/2 + 1)(n_r - n_{\min}/2 + 1)$ for this reason. It must be noted that the power control matrix $\mathbf{T}$ plays an important role in the error performance of our precoder (also the E-$d_{\min}$ precoder for 4-QAM), as explained in Section V. Before we analyze the achievable diversity gain of the system with the proposed precoding scheme, the following important observation needs to be made about $\delta(\gamma_i, \mathcal{A}_{M\text{-QAM}})$. Since $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_{n_{\min}}$, we have $\frac{\sigma_{n_{\min}}}{\sigma_1} \leq \frac{\sigma_{n_{\min} - 1}}{\sigma_2} \leq \cdots \leq \frac{\sigma_{n_{\min}/2 + 1}}{\sigma_{n_{\min}/2}}$. Consequently,

$$\tan^{-1}\left(\frac{\sigma_{n_{\min}}}{\sigma_1}\right) \leq \tan^{-1}\left(\frac{\sigma_{n_{\min} - 1}}{\sigma_2}\right) \leq \cdots \leq \tan^{-1}\left(\frac{\sigma_{n_{\min}/2 + 1}}{\sigma_{n_{\min}/2}}\right)$$

and therefore, $\gamma_1 \leq \gamma_2 \leq \cdots \leq \gamma_{n_{\min}/2}$. From Fig. 1, except for the case of 4-QAM, we can conclude that $\delta(\gamma_i, \mathcal{A}_{M\text{-QAM}}) \leq \delta(\gamma_j, \mathcal{A}_{M\text{-QAM}})$ for $i < j$. Due to this fact, although it is expected that for $1 \leq i < j \leq n_{\min}/2$, $\rho_i^2 \geq \rho_j^2$, it is not guaranteed that $\rho_i^2\, \delta(\gamma_i, \mathcal{A}_{M\text{-QAM}}) \geq \rho_j^2\, \delta(\gamma_j, \mathcal{A}_{M\text{-QAM}})$, due to which, even without the use of $\mathbf{T}$, the overall diversity gain of the system might be higher than $(n_t - n_{\min}/2 + 1)(n_r - n_{\min}/2 + 1)$ (this holds true even for the X-precoder). With the use of $\mathbf{T}$ for our proposed precoder, the channel dependent instantaneous WEP is dependent on $\eta$, as seen in (35). Let

$$\zeta = \frac{\eta}{\rho_1 \sqrt{\delta(\gamma_1, \mathcal{A}_{M\text{-QAM}})}}$$

and $P_\zeta$ be the probability that $\zeta < 1$.

In Table III, we tabulate the values of $\zeta_{\min}$, which is the minimum value of $\zeta$ obtained in simulations, and $P_\zeta$, both calculated by simulating $10^7$ channel realizations, for different MIMO systems. In the table, we observe that for $n_t = 16, 32$ and for $M \geq 64$, $\zeta$ is always greater than 1. This can be attributed to the fact that for higher values of $n_{\min}$, the ratio of $\sigma_{n_{\min}}$ to $\sigma_1$ is very low and the corresponding value of $\delta(\gamma_1, \mathcal{A}_{M\text{-QAM}})$ is also very low. For such systems, we can safely say that the full-diversity gain equal to $n_t n_r$ is achieved (since $\sigma_1$ is associated with a diversity gain of $n_t n_r$). For other systems, the simulation results in Table III seem to indicate that there exists a $\zeta_{\min} > 0$ such that $\zeta_{\min} \leq \zeta$, i.e., $\zeta$ is lower bounded by $\zeta_{\min}$. So, from (35),

$$P_{e,D} \leq \left(M^{n_{\min}} - 1\right) Q\left(\sqrt{\frac{SNR\, \zeta^2 \rho_1^2\, \delta(\gamma_1, \mathcal{A}_{M\text{-QAM}})}{n_{\min} E_M}}\right).$$

Let $\delta_M = \min\{\delta(\gamma_1, \mathcal{A}_{M\text{-QAM}})\}$, which is a constant depending on $M$. Then,

$$P_{e,D} \leq \left(M^{n_{\min}} - 1\right) Q\left(\sqrt{\frac{SNR\, \zeta_{\min}^2 \rho_1^2\, \delta(\gamma_1, \mathcal{A}_{M\text{-QAM}})}{n_{\min} E_M}}\right) \leq \left(M^{n_{\min}} - 1\right) Q\left(\sqrt{\frac{SNR\, \zeta_{\min}^2 \delta_M\, \rho_1^2}{n_{\min} E_M}}\right) \leq \left(M^{n_{\min}} - 1\right) Q\left(\sqrt{\frac{\zeta_{\min}^2 \delta_M\, \lambda_1 SNR}{n_{\min} E_M}}\right),$$

where, as used throughout the paper, $\lambda_1 = \sigma_1^2$. From Theorem 2, we obtain, as $SNR \to \infty$,

$$P_e \leq C \cdot SNR^{-n_t n_r} + o\left(SNR^{-n_t n_r}\right),$$

where

$$C = \left(M^{n_{\min}} - 1\right) a_1\, \frac{(2n_t n_r - 1)(2n_t n_r - 3) \cdots 1}{2 n_t n_r} \left(\frac{\delta_M \zeta_{\min}^2}{n_{\min} E_M}\right)^{-n_t n_r} \tag{37}$$

with $a_1$ being a constant in the expression for the marginal PDF of $\lambda_1$, as defined in Theorem 2. Therefore, the overall diversity gain of the system is $n_t n_r$. Note that in (37), $\delta_M$ and $\zeta_{\min}$ define the coding gain: the higher the value of $\zeta_{\min}$ and $\delta_M$, the better the error performance. It is not known if $\zeta_{\min}$ can be obtained analytically. The values in Table III are only indicative of what the actual $\zeta_{\min}$ is likely to be. For example, for the $16 \times 16$ system with 64-QAM, $\zeta_{\min}$ is likely to be greater than 1. Thus, we have shown that our precoding scheme provides full-diversity. This claim is supported by the WEP plots for different MIMO systems, shown in the following section.

VII. Simulation results

For all simulations, we consider the Rayleigh fading channel with perfect CSIT and CSIR. We consider three MIMO systems: $2 \times 2$, $4 \times 4$ and $8 \times 8$. For the $2 \times 2$ MIMO system, the rival precoders for our precoder are the E-$d_{\min}$ precoder and the X-precoder. We have left out the Y-precoder since it has been shown in [16] to have an error performance comparable with that of the X-precoder for 4-QAM, while for 16-QAM, it is not expected to beat the X-precoder, as can be inferred from Fig. 3. The constellations employed are 4-QAM and 16-QAM. For 16-QAM, the E-$d_{\min}$ precoder is not considered since it is very hard to obtain and not explicitly stated in the literature. For the X-precoder, we have obtained the approximately optimal angles for 16-QAM using a numerical search for $\gamma = k\Delta$, $k = 1, 2, \cdots, \lfloor \pi/(4\Delta) \rfloor$, $\Delta = 0.001$, and have used a look-up table to obtain the appropriate angle for the corresponding value of $\gamma$ during simulations. A look-up table is necessary since the approximately optimal angle for the X-precoder is not a weighted sum of shifted step functions like that for our precoder. Fig. 5 shows the plots of the word error probability (WEP) as a function of the average SNR at each receive antenna for the $2 \times 2$ system. As expected, the E-$d_{\min}$ precoder has the best error performance for 4-QAM, marginally beating our precoder, which in turn significantly beats the X-precoder. For 16-QAM, our precoder beats the X-precoder by about 1.5 dB at an SNR of 30 dB.

For $4 \times 4$ and $8 \times 8$ systems, we also consider the lattice precoder, which is the orthogonal matrix with the largest known non-zero product distance for $n_{\min} = n_t$ real dimensions, and is given explicitly in [22]. This precoder has been shown in Theorem 3 to offer full-diversity. The plots of the WEP for the $4 \times 4$ system and the $8 \times 8$ system are given in Fig. 6 and Fig. 7, respectively. The plots indicate that the E-$d_{\min}$ precoder and our proposed precoder offer full-diversity, since they beat the full-diversity achieving lattice precoder. Even the X-precoder appears to offer full-diversity, losing out in coding gain only; the explanation for this has already been given in Section VI. Our precoder significantly outperforms the X-precoder while having lower expected ML-decoding complexity (as shown in Theorem 4), while the E-$d_{\min}$ precoder has the best error performance for 4-QAM, marginally beating our precoder, but this is at the expense of ML-decoding complexity. In Table IV, by simulating $10^6$ channel realizations, we have tabulated the probability that ML-decoding can be done without searching over any of the signal points for 4- and 16-QAM. It can be noted that for the $2 \times 2$ MIMO system with 4-QAM, for more than 50% of the channel realizations, no search over any of the signal points is required, while for the $4 \times 4$ and the $8 \times 8$ MIMO systems, half the number of subsystems do not require any search over the constellation points for more than 99% of the channel realizations. This advantage, however, diminishes with the increase in constellation size.

VIII. Discussion

For systems with full CSIT, we have proposed a real-valued precoder for $n_t = 2$, which, for QAM constellations, is approximately optimal among all real-valued precoders based on the SVD of the channel matrix and has an expected ML-decoding complexity lower than $O(\sqrt{M})$. The advantage of the proposed precoder over the E-$d_{\min}$ precoder is that it is much easier to obtain for larger QAM constellations and it also has lower ML-decoding complexity, while the loss in error performance for 4-QAM is only marginal. The proposed precoder comfortably beats the X-precoder in error performance while having lower expected ML-decoding complexity. A precoding scheme for $n_t > 2$ is also given and this scheme is shown to offer full-diversity with QAM constellations. It would be interesting to design low ML-decoding complexity, full-rate, full-diversity precoders for more realistic scenarios, such as systems with imperfect CSIT or partial CSIT.
TABLE I

Approximately optimal values of $\theta$ and $\psi$ for various QAM constellations

* When $\gamma'_1 < \gamma < \gamma'_2$, for the E-$d_{\min}$ precoder, a full search over all the signal points is needed, while for the proposed precoder, no search over signal points is needed.

† Amounts to storing the near-optimal angle values for $\gamma = k\Delta$, $k = 1, 2, \cdots$, where $\Delta$ is a suitable step size.

TABLE III

Characteristics of $\zeta$ for different MIMO systems

TABLE IV

Probability that no search is required for each subsystem of different MIMO systems

Fig. 1. $\delta(\gamma, \mathcal{A}_{M\text{-QAM}})$ as a function of $\gamma$ for the proposed precoder for various QAM constellations

Fig. 3. $\delta(\gamma, \mathcal{A})$ comparison for $|\mathcal{A}| = 16$

Fig. 5. WEP comparison for $2 \times 2$ MIMO systems for 4-QAM and 16-QAM

Fig. 6. WEP comparison for $4 \times 4$ MIMO systems for 4-QAM and 16-QAM

Fig. 7. WEP comparison for $8 \times 8$ MIMO systems for 4-QAM and 16-QAM